API Reference¶
This page contains the API documentation for all Python modules in the codebase (excluding `__init__.py` files).
aiperf.cli¶
Main CLI entry point for the AIPerf system.
profile(user_config, service_config=None)
¶
Run the Profile subcommand.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `user_config` | `UserConfig` | User configuration for the benchmark | *required* |
| `service_config` | `ServiceConfig \| None` | Service configuration options | `None` |
Source code in aiperf/cli.py
aiperf.cli_runner¶
run_system_controller(user_config, service_config)
¶
Run the system controller with the given configuration.
Source code in aiperf/cli_runner.py
aiperf.cli_utils¶
exit_on_error
¶
Bases: AbstractContextManager
Context manager that exits the program if an error occurs.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `*exceptions` | `type[BaseException]` | The exceptions to exit on. If no exceptions are provided, all exceptions will be caught. | `()` |
| `message` | `RenderableType` | The message to display. Can be a string or a rich renderable. Will be formatted with the exception as `{e}`. | `'{e}'` |
| `text_color` | `StyleType` | The text color to use. | `'bold red'` |
| `title` | `str` | The title of the error. | `'Error'` |
| `exit_code` | `int` | The exit code to use. | `1` |
Source code in aiperf/cli_utils.py
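Illustrative only: a minimal, self-contained sketch of how such an exit-on-error context manager might work. The `ExitOnError` name and the plain `print` output are assumptions; the real implementation renders a styled rich panel with the documented `text_color`, `title`, and border options.

```python
import sys
from contextlib import AbstractContextManager


class ExitOnError(AbstractContextManager):
    """Sketch: exit the program if one of the given exceptions occurs."""

    def __init__(self, *exceptions, message="{e}", exit_code=1):
        # If no exceptions are provided, catch all exceptions.
        self.exceptions = exceptions or (BaseException,)
        self.message = message
        self.exit_code = exit_code

    def __exit__(self, exc_type, exc, tb):
        if exc is not None and isinstance(exc, self.exceptions):
            # Format the exception into the message as '{e}', then exit.
            print(self.message.format(e=exc), file=sys.stderr)
            sys.exit(self.exit_code)
        return False  # propagate anything we were not asked to catch
```

Typical usage would wrap a fallible startup step, e.g. `with ExitOnError(FileNotFoundError, message="Config missing: {e}"): ...`.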
raise_startup_error_and_exit(message, text_color='bold red', title='Error', exit_code=1, border_style='red', title_align='left')
¶
Raise a startup error and exit the program.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `message` | `RenderableType` | The message to display. Can be a string or a rich renderable. | *required* |
| `text_color` | `StyleType` | The text color to use. | `'bold red'` |
| `title` | `str` | The title of the error. | `'Error'` |
| `exit_code` | `int` | The exit code to use. | `1` |
| `border_style` | `StyleType` | The border style to use. | `'red'` |
| `title_align` | `AlignMethod` | The alignment of the title. | `'left'` |
Source code in aiperf/cli_utils.py
warn_cancelled_early()
¶
Warn the user that the profile run was cancelled early.
Source code in aiperf/cli_utils.py
warn_command_not_implemented(command)
¶
Warn the user that the subcommand is not implemented.
Source code in aiperf/cli_utils.py
aiperf.clients.http.aiohttp_client¶
AioHttpClientMixin
¶
Bases: AIPerfLoggerMixin
A high-performance HTTP client for communicating with HTTP based REST APIs using aiohttp.
This class is optimized for maximum performance and accurate timing measurements, making it ideal for benchmarking scenarios.
Source code in aiperf/clients/http/aiohttp_client.py
close()
async
¶
Close the client.
Source code in aiperf/clients/http/aiohttp_client.py
post_request(url, payload, headers, **kwargs)
async
¶
Send a streaming or non-streaming POST request to the specified URL with the given payload and headers.
If the response is an SSE stream, the response will be parsed into a list of SSE messages. Otherwise, the response will be parsed into a TextResponse object.
Source code in aiperf/clients/http/aiohttp_client.py
AioHttpSSEStreamReader
¶
A helper class for reading an SSE stream from an aiohttp.ClientResponse object.
This class is optimized for maximum performance and accurate timing measurements, making it ideal for benchmarking scenarios.
Source code in aiperf/clients/http/aiohttp_client.py
__aiter__()
async
¶
Iterate over the SSE stream in a performant manner and return a tuple of the raw SSE message, the perf_counter_ns of the first byte, and the perf_counter_ns of the last byte. This provides the most accurate timing information possible without any delays due to the nature of the aiohttp library. The first byte is read immediately to capture the timestamp of the first byte, and the last byte is read after the rest of the chunk is read to capture the timestamp of the last byte.
Returns:

| Type | Description |
|---|---|
| `AsyncIterator[tuple[str, int]]` | An async iterator of tuples of the raw SSE message and the perf_counter_ns of the first byte. |
Source code in aiperf/clients/http/aiohttp_client.py
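The first-byte timestamping idea above can be sketched in a few lines. This is a simplified, hypothetical stand-in (the function name and the plain chunk source are assumptions, not the aiohttp-backed reader): the key point is that `perf_counter_ns` is captured as soon as a chunk arrives, before any further processing.

```python
import asyncio
import time


async def iter_with_first_byte_time(chunks):
    """Sketch: yield (data, first_byte_perf_ns) for each streamed message.

    Mirrors the idea of stamping perf_counter_ns the moment the first
    byte of a message is available, before the rest is assembled.
    """
    async for chunk in chunks:
        first_byte_ns = time.perf_counter_ns()  # stamp on arrival
        # ... a real reader would assemble the remainder of the message here ...
        yield chunk, first_byte_ns


async def demo():
    async def source():
        for part in (b"data: one\n\n", b"data: two\n\n"):
            yield part

    return [item async for item in iter_with_first_byte_time(source())]
```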
read_complete_stream()
async
¶
Read the complete SSE stream in a performant manner and return a list of SSE messages that contain the most accurate timestamp data possible.
Returns:

| Type | Description |
|---|---|
| `list[SSEMessage]` | A list of SSE messages. |
Source code in aiperf/clients/http/aiohttp_client.py
create_tcp_connector(**kwargs)
¶
Create a new connector with the given configuration.
Source code in aiperf/clients/http/aiohttp_client.py
parse_sse_message(raw_message, perf_ns)
¶
Parse a raw SSE message into an SSEMessage object.
Parsing logic based on official HTML SSE Living Standard: https://html.spec.whatwg.org/multipage/server-sent-events.html#parsing-an-event-stream
Source code in aiperf/clients/http/aiohttp_client.py
aiperf.clients.http.defaults¶
AioHttpDefaults
dataclass
¶
Default values for aiohttp.ClientSession.
Source code in aiperf/clients/http/defaults.py
SocketDefaults
dataclass
¶
Default values for socket options.
Source code in aiperf/clients/http/defaults.py
apply_to_socket(sock)
classmethod
¶
Apply the default socket options to the given socket.
Source code in aiperf/clients/http/defaults.py
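Illustrative only: the specific options below are common low-latency defaults one might apply for benchmarking (the function name and option choices are assumptions, not necessarily the options `SocketDefaults` actually sets). `TCP_NODELAY` disables Nagle's algorithm so small requests are sent immediately; `SO_KEEPALIVE` detects dead peers.

```python
import socket


def apply_benchmark_socket_defaults(sock: socket.socket) -> None:
    """Sketch: apply low-latency default options to a TCP socket."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
apply_benchmark_socket_defaults(sock)
```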
aiperf.clients.model_endpoint_info¶
Model endpoint information.
This module contains the pydantic models that encapsulate the information needed to send requests to an inference server, primarily around the model, endpoint, and additional request payload information.
EndpointInfo
¶
Bases: AIPerfBaseModel
Information about an endpoint.
Source code in aiperf/clients/model_endpoint_info.py
from_user_config(user_config)
classmethod
¶
Create an HttpEndpointInfo from a UserConfig.
Source code in aiperf/clients/model_endpoint_info.py
ModelEndpointInfo
¶
Bases: AIPerfBaseModel
Information about a model endpoint.
Source code in aiperf/clients/model_endpoint_info.py
primary_model
property
¶
Get the primary model.
primary_model_name
property
¶
Get the primary model name.
url
property
¶
Get the full URL for the endpoint.
from_user_config(user_config)
classmethod
¶
Create a ModelEndpointInfo from a UserConfig.
Source code in aiperf/clients/model_endpoint_info.py
ModelInfo
¶
Bases: AIPerfBaseModel
Information about a model.
Source code in aiperf/clients/model_endpoint_info.py
ModelListInfo
¶
Bases: AIPerfBaseModel
Information about a list of models.
Source code in aiperf/clients/model_endpoint_info.py
from_user_config(user_config)
classmethod
¶
Create a ModelListInfo from a UserConfig.
Source code in aiperf/clients/model_endpoint_info.py
aiperf.clients.openai.openai_aiohttp¶
OpenAIClientAioHttp
¶
Bases: AioHttpClientMixin, AIPerfLoggerMixin, ABC
Inference client for OpenAI based requests using aiohttp.
Source code in aiperf/clients/openai/openai_aiohttp.py
get_headers(model_endpoint)
¶
Get the headers for the given endpoint.
Source code in aiperf/clients/openai/openai_aiohttp.py
get_url(model_endpoint)
¶
Get the URL for the given endpoint.
Source code in aiperf/clients/openai/openai_aiohttp.py
send_request(model_endpoint, payload)
async
¶
Send OpenAI request using aiohttp.
Source code in aiperf/clients/openai/openai_aiohttp.py
aiperf.clients.openai.openai_chat¶
OpenAIChatCompletionRequestConverter
¶
Bases: AIPerfLoggerMixin
Request converter for OpenAI chat completion requests.
Source code in aiperf/clients/openai/openai_chat.py
format_payload(model_endpoint, turn)
async
¶
Format payload for a chat completion request.
Source code in aiperf/clients/openai/openai_chat.py
aiperf.clients.openai.openai_completions¶
OpenAICompletionRequestConverter
¶
Bases: AIPerfLoggerMixin
Request converter for OpenAI completion requests.
Source code in aiperf/clients/openai/openai_completions.py
format_payload(model_endpoint, turn)
async
¶
Format payload for a completion request.
Source code in aiperf/clients/openai/openai_completions.py
aiperf.clients.openai.openai_embeddings¶
OpenAIEmbeddingsRequestConverter
¶
Bases: AIPerfLoggerMixin
Request converter for OpenAI embeddings requests.
Source code in aiperf/clients/openai/openai_embeddings.py
format_payload(model_endpoint, turn)
async
¶
Format payload for an embeddings request.
Source code in aiperf/clients/openai/openai_embeddings.py
aiperf.clients.openai.openai_responses¶
OpenAIResponsesRequestConverter
¶
Bases: AIPerfLoggerMixin
Request converter for OpenAI Responses requests.
Source code in aiperf/clients/openai/openai_responses.py
format_payload(model_endpoint, turn)
async
¶
Format payload for a responses request.
Source code in aiperf/clients/openai/openai_responses.py
aiperf.common.aiperf_logger¶
AIPerfLogger
¶
Logger for AIPerf messages with lazy evaluation support for f-strings.
This logger supports lazy evaluation of f-strings through lambdas to avoid expensive string formatting operations when the log level is not enabled.
It also extends the standard logging module with additional log levels
- TRACE (TRACE < DEBUG)
- NOTICE (INFO < NOTICE < WARNING)
- SUCCESS (WARNING < SUCCESS < ERROR)
Usage:

```python
logger = AIPerfLogger("my_logger")
logger.debug(lambda: f"Processing {item} with {count} items")
logger.info("Simple string message")
logger.notice("Notice message")
logger.success("Benchmark completed successfully")

# Need to pass local variables to the lambda to avoid them going out of scope
logger.debug(lambda i=i: f"Binding loop variable: {i}")
logger.exception(f"Direct f-string usage: {e}")
```
Source code in aiperf/common/aiperf_logger.py
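The two ideas above (extra levels and lazy lambdas) can be sketched on top of the standard `logging` module. This is a simplified, hypothetical `LazyLogger`, not the actual `AIPerfLogger` implementation; the level numbers are assumptions consistent with the ordering documented above (TRACE < DEBUG, INFO < NOTICE < WARNING, WARNING < SUCCESS < ERROR).

```python
import logging

# Assumed numeric values that satisfy the documented ordering.
TRACE, NOTICE, SUCCESS = 5, 25, 35
logging.addLevelName(TRACE, "TRACE")
logging.addLevelName(NOTICE, "NOTICE")
logging.addLevelName(SUCCESS, "SUCCESS")


class LazyLogger:
    """Sketch: defer f-string formatting until the level is enabled."""

    def __init__(self, name: str):
        self._logger = logging.getLogger(name)

    def log(self, level: int, msg, *args, **kwargs):
        if not self._logger.isEnabledFor(level):
            return  # the lambda is never evaluated when the level is off
        if callable(msg):
            msg = msg()  # evaluate the lazy f-string now
        self._logger.log(level, msg, *args, **kwargs)

    def trace(self, msg):
        self.log(TRACE, msg)

    def notice(self, msg):
        self.log(NOTICE, msg)

    def success(self, msg):
        self.log(SUCCESS, msg)
```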
critical(msg, *args, **kwargs)
¶
Log a critical message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py
debug(msg, *args, **kwargs)
¶
Log a debug message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py
error(msg, *args, **kwargs)
¶
Log an error message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py
exception(msg, *args, **kwargs)
¶
Log an exception message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py
find_caller(stack_info=False, stacklevel=1)
¶
NOTE: This is a modified version of the findCaller method in the logging module, in order to allow us to add custom ignored files.
Find the stack frame of the caller so that we can note the source file name, line number and function name.
Source code in aiperf/common/aiperf_logger.py
get_level_number(level)
classmethod
¶
Get the numeric level for the given level.
Source code in aiperf/common/aiperf_logger.py
info(msg, *args, **kwargs)
¶
Log an info message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py
is_valid_level(level)
classmethod
¶
Check if the given level is a valid level.
Source code in aiperf/common/aiperf_logger.py
log(level, msg, *args, **kwargs)
¶
Log a message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py
notice(msg, *args, **kwargs)
¶
Log a notice message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py
success(msg, *args, **kwargs)
¶
Log a success message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py
trace(msg, *args, **kwargs)
¶
Log a trace message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py
trace_or_debug(trace_msg, debug_msg)
¶
Log different messages depending on the level of the logger.
This method is used to log a message at the trace level if the trace level is enabled, otherwise it will log a debug message. It enables us to use a single method to log different messages depending on the level of the logger. Use this method to provide full dumps of data when the logger is in trace mode, and a more concise message when the logger is in debug mode.
Example:

```python
self.trace_or_debug(
    lambda: f"Received request: {request}",
    lambda: f"Received request id: {request.id}",
)
```
Source code in aiperf/common/aiperf_logger.py
warning(msg, *args, **kwargs)
¶
Log a warning message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py
aiperf.common.base_comms¶
BaseCommunication
¶
Bases: AIPerfLifecycleMixin, ABC
Base class for specifying the base communication layer for AIPerf components.
Source code in aiperf/common/base_comms.py
create_client(client_type, address, bind=False, socket_ops=None, max_pull_concurrency=None)
abstractmethod
¶
Create a communication client for a given client type and address.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `client_type` | `CommClientType` | The type of client to create. | *required* |
| `address` | `CommAddressType` | The type of address to use when looking up in the communication config, or the address itself. | *required* |
| `bind` | `bool` | Whether to bind or connect the socket. | `False` |
| `socket_ops` | `dict \| None` | Additional socket options to set. | `None` |
| `max_pull_concurrency` | `int \| None` | The maximum number of concurrent pull requests to allow (only used for pull clients). | `None` |
Source code in aiperf/common/base_comms.py
get_address(address_type)
abstractmethod
¶
Get the address for a given address type.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `address_type` | `CommAddressType` | The type of address to get the address for, or the address itself. | *required* |

Returns:

| Type | Description |
|---|---|
| `str` | The address for the given address type, or the address itself if it is a string. |
Source code in aiperf/common/base_comms.py
aiperf.common.base_component_service¶
BaseComponentService
¶
Bases: BaseService
Base class for all Component services.
This class provides a common interface for all Component services in the AIPerf framework such as the Timing Manager, Dataset Manager, etc.
It extends the BaseService by adding heartbeat and registration functionality, as well as publishing the current state of the service to the system controller.
Source code in aiperf/common/base_component_service.py
aiperf.common.base_service¶
BaseService
¶
Bases: CommandHandlerMixin, ABC
Base class for all AIPerf services, providing common functionality for communication, state management, and lifecycle operations. This class inherits from the MessageBusClientMixin, which provides the message bus client functionality.
This class provides the foundation for implementing the various services of the AIPerf system. Some of the abstract methods are implemented here, while others are still required to be implemented by derived classes.
Source code in aiperf/common/base_service.py
service_type
class-attribute
¶
The type of service this class implements. This is set by the ServiceFactory.register decorator.
stop()
async
¶
This overrides the base class stop method to handle the case where the service is already stopping. In this case, we need to kill the process to be safe.
Source code in aiperf/common/base_service.py
aiperf.common.bootstrap¶
bootstrap_and_run_service(service_class, service_config=None, user_config=None, service_id=None, log_queue=None, **kwargs)
¶
Bootstrap the service and run it.
This function will load the service configuration, create an instance of the service, and run it.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `service_class` | `type[ServiceProtocol]` | The Python class of the service to run. This should be a subclass of BaseService, passed as a type and not an instance. | *required* |
| `service_config` | `ServiceConfig \| None` | The service configuration to use. If not provided, it will be loaded from environment variables. | `None` |
| `user_config` | `UserConfig \| None` | The user configuration to use. If not provided, it will be loaded from environment variables. | `None` |
| `log_queue` | `Queue \| None` | Optional multiprocessing queue for child process logging. If provided, child process logging will be set up. | `None` |
| `kwargs` | | Additional keyword arguments to pass to the service constructor. | `{}` |
Source code in aiperf/common/bootstrap.py
aiperf.common.config.audio_config¶
AudioConfig
¶
Bases: BaseConfig
A configuration class for defining audio related settings.
Source code in aiperf/common/config/audio_config.py
AudioLengthConfig
¶
Bases: BaseConfig
A configuration class for defining audio length related settings.
Source code in aiperf/common/config/audio_config.py
aiperf.common.config.base_config¶
BaseConfig
¶
Bases: BaseModel
Base configuration class for all configurations.
Source code in aiperf/common/config/base_config.py
serialize_to_yaml(verbose=False, indent=4)
¶
Serialize a Pydantic model to a YAML string.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `verbose` | `bool` | Whether to include verbose comments in the YAML output. | `False` |
| `indent` | `int` | The per-level indentation to use. | `4` |
Source code in aiperf/common/config/base_config.py
aiperf.common.config.config_defaults¶
aiperf.common.config.config_validators¶
parse_file(value)
¶
Parses the given string value and returns a Path object if the value represents a valid file or directory. Returns None if the input value is empty.

Args:
    value (str): The string value to parse.

Returns:
    Optional[Path]: A Path object if the value is valid, or None if the value is empty.

Raises:
    ValueError: If the value is not a valid file or directory.
Source code in aiperf/common/config/config_validators.py
parse_service_types(input)
¶
Parses the input to ensure it is a set of service types. Will replace hyphens with underscores for user convenience.
Source code in aiperf/common/config/config_validators.py
parse_str_or_csv_list(input)
¶
Parses the input to ensure it is either a string or a list. If the input is a string, it splits the string by commas and trims any whitespace around each element, returning the result as a list. If the input is already a list, it will split each item by commas and trim any whitespace around each element, returning the combined result as a list. If the input is neither a string nor a list, a ValueError is raised.
```python
[1, 2, 3]          -> [1, 2, 3]
"1,2,3"            -> ["1", "2", "3"]
["1,2,3", "4,5,6"] -> ["1", "2", "3", "4", "5", "6"]
["1,2,3", 4, 5]    -> ["1", "2", "3", 4, 5]
```
Source code in aiperf/common/config/config_validators.py
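The documented behavior can be sketched as follows (a simplified, hypothetical reimplementation matching the examples above, not the actual source):

```python
def parse_str_or_csv_list_sketch(value):
    """Sketch: split strings on commas; flatten string items inside lists;
    pass non-string list items through unchanged."""
    if isinstance(value, str):
        return [item.strip() for item in value.split(",")]
    if isinstance(value, list):
        result = []
        for item in value:
            if isinstance(item, str):
                result.extend(part.strip() for part in item.split(","))
            else:
                result.append(item)
        return result
    raise ValueError(f"Expected a string or list, got {type(value).__name__}")
```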
parse_str_or_dict(input)
¶
Parses the input to ensure it is a dictionary.
- If the input is a string:
- If the string starts with a '{', it is parsed as a JSON string.
- Otherwise, it splits the string by commas and then for each item, it splits the item by colons into key and value, trims any whitespace.
- If the input is already a dictionary, it is returned as-is.
- If the input is a list, it is converted to a dictionary by splitting each string by colons into key and value, trims any whitespace.
- Otherwise, a ValueError is raised.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input` | `Any` | The input to be parsed. Expected to be a string, list, or dictionary. | *required* |

Returns:
    dict[str, Any]: A dictionary derived from the input.

Raises:
    ValueError: If the input is neither a string, list, nor dictionary, or if the parsing fails.
Source code in aiperf/common/config/config_validators.py
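The rules above can be sketched in a few lines (a simplified, hypothetical reimplementation, not the actual source; the real validator may handle edge cases differently):

```python
import json


def parse_str_or_dict_sketch(value):
    """Sketch: accept a dict as-is, a JSON object string, a 'k:v,k2:v2'
    string, or a list of 'k:v' strings."""
    if isinstance(value, dict):
        return value
    if isinstance(value, str):
        if value.lstrip().startswith("{"):
            return json.loads(value)  # JSON object string
        value = value.split(",")  # fall through to the list branch
    if isinstance(value, list):
        result = {}
        for item in value:
            key, _, val = item.partition(":")
            result[key.strip()] = val.strip()
        return result
    raise ValueError(f"Cannot parse {value!r} into a dictionary")
```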
parse_str_or_list(input)
¶
Parses the input to ensure it is either a string or a list. If the input is a string, it splits the string by commas and trims any whitespace around each element, returning the result as a list. If the input is already a list, it is returned as-is. If the input is neither a string nor a list, a ValueError is raised.

Args:
    input (Any): The input to be parsed. Expected to be a string or a list.

Returns:
    list: A list of strings derived from the input.

Raises:
    ValueError: If the input is neither a string nor a list.
Source code in aiperf/common/config/config_validators.py
parse_str_or_list_of_positive_values(input)
¶
Parses the input to ensure it is a list of positive integers or floats.
This function first converts the input into a list using parse_str_or_list.
It then validates that each value in the list is either an integer or a float
and that all values are strictly greater than zero. If any value fails this
validation, a ValueError is raised.
Args:
input (Any): The input to be parsed. It can be a string or a list.
Returns:
List[Any]: A list of positive integers or floats.
Raises:
ValueError: If any value in the parsed list is not a positive integer or float.
Source code in aiperf/common/config/config_validators.py
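A minimal sketch of the validation described above (hypothetical names and simplified coercion to `float`; the real implementation preserves the int/float distinction per value):

```python
def parse_positive_values_sketch(value):
    """Sketch: split a CSV string (or take a list) and require every
    value to be strictly greater than zero."""
    items = value.split(",") if isinstance(value, str) else list(value)
    result = []
    for item in items:
        number = float(item)
        if number <= 0:
            raise ValueError(f"{item!r} is not a positive number")
        result.append(number)
    return result
```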
aiperf.common.config.conversation_config¶
ConversationConfig
¶
Bases: BaseConfig
A configuration class for defining conversations related settings.
Source code in aiperf/common/config/conversation_config.py
TurnConfig
¶
Bases: BaseConfig
A configuration class for defining turn related settings in a conversation.
Source code in aiperf/common/config/conversation_config.py
TurnDelayConfig
¶
Bases: BaseConfig
A configuration class for defining turn delay related settings.
Source code in aiperf/common/config/conversation_config.py
aiperf.common.config.endpoint_config¶
EndpointConfig
¶
Bases: BaseConfig
A configuration class for defining endpoint related settings.
Source code in aiperf/common/config/endpoint_config.py
aiperf.common.config.groups¶
Groups
¶
Groups for the CLI.
NOTE: The order of these groups are the order they will be displayed in the help text.
Source code in aiperf/common/config/groups.py
aiperf.common.config.image_config¶
ImageConfig
¶
Bases: BaseConfig
A configuration class for defining image related settings.
Source code in aiperf/common/config/image_config.py
ImageHeightConfig
¶
Bases: BaseConfig
A configuration class for defining image height related settings.
Source code in aiperf/common/config/image_config.py
ImageWidthConfig
¶
Bases: BaseConfig
A configuration class for defining image width related settings.
Source code in aiperf/common/config/image_config.py
aiperf.common.config.input_config¶
InputConfig
¶
Bases: BaseConfig
A configuration class for defining input related settings.
Source code in aiperf/common/config/input_config.py
validate_fixed_schedule()
¶
Validate the fixed schedule configuration.
Source code in aiperf/common/config/input_config.py
validate_fixed_schedule_start_and_end_offset()
¶
Validate the fixed schedule start and end offset configuration.
Source code in aiperf/common/config/input_config.py
validate_fixed_schedule_start_offset()
¶
Validate the fixed schedule start offset configuration.
Source code in aiperf/common/config/input_config.py
aiperf.common.config.loader¶
load_service_config()
¶
Load the service configuration.
Source code in aiperf/common/config/loader.py
load_user_config()
¶
Load the user configuration.
Source code in aiperf/common/config/loader.py
aiperf.common.config.loadgen_config¶
LoadGeneratorConfig
¶
Bases: BaseConfig
A configuration class for defining top-level load generator settings.
Source code in aiperf/common/config/loadgen_config.py
aiperf.common.config.output_config¶
OutputConfig
¶
Bases: BaseConfig
A configuration class for defining output related settings.
Source code in aiperf/common/config/output_config.py
aiperf.common.config.prompt_config¶
InputTokensConfig
¶
Bases: BaseConfig
A configuration class for defining input token related settings.
Source code in aiperf/common/config/prompt_config.py
OutputTokensConfig
¶
Bases: BaseConfig
A configuration class for defining output token related settings.
Source code in aiperf/common/config/prompt_config.py
PrefixPromptConfig
¶
Bases: BaseConfig
A configuration class for defining prefix prompt related settings.
Source code in aiperf/common/config/prompt_config.py
PromptConfig
¶
Bases: BaseConfig
A configuration class for defining prompt related settings.
Source code in aiperf/common/config/prompt_config.py
aiperf.common.config.service_config¶
ServiceConfig
¶
Bases: BaseSettings
Base configuration for all services. It will be provided to all services during their init function.
Source code in aiperf/common/config/service_config.py
validate_comm_config()
¶
Initialize the comm_config if it is not provided, based on the comm_backend.
Source code in aiperf/common/config/service_config.py
validate_log_level_from_verbose_flags()
¶
Set log level based on verbose flags.
Source code in aiperf/common/config/service_config.py
aiperf.common.config.tokenizer_config¶
TokenizerConfig
¶
Bases: BaseConfig
A configuration class for defining tokenizer related settings.
Source code in aiperf/common/config/tokenizer_config.py
aiperf.common.config.user_config¶
UserConfig
¶
Bases: BaseConfig
A configuration class for defining top-level user settings.
Source code in aiperf/common/config/user_config.py
aiperf.common.config.worker_config¶
WorkersConfig
¶
Bases: BaseConfig
Worker configuration.
Source code in aiperf/common/config/worker_config.py
aiperf.common.config.zmq_config¶
BaseZMQCommunicationConfig
¶
Bases: BaseModel, ABC
Configuration for ZMQ communication.
Source code in aiperf/common/config/zmq_config.py
credit_drop_address
abstractmethod
property
¶
Get the credit drop address based on protocol configuration.
credit_return_address
abstractmethod
property
¶
Get the credit return address based on protocol configuration.
records_push_pull_address
abstractmethod
property
¶
Get the records push/pull address based on protocol configuration.
get_address(address_type)
¶
Get the actual address based on the address type.
Source code in aiperf/common/config/zmq_config.py
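The get_address helper resolves an address-type tag to the matching address property. A minimal sketch of that dispatch pattern, assuming IPC-style addresses; the class name, base path, and property names here are illustrative, not the real aiperf config:

```python
# Illustrative sketch of address-type dispatch (not the actual aiperf class).
class IPCCommunicationConfig:
    def __init__(self, base_path="/tmp/aiperf"):
        self.base_path = base_path

    @property
    def credit_drop_address(self):
        # IPC endpoints are filesystem paths under the base directory.
        return f"ipc://{self.base_path}/credit_drop"

    @property
    def credit_return_address(self):
        return f"ipc://{self.base_path}/credit_return"

    def get_address(self, address_type):
        # Map an address-type tag like "credit_drop" to the matching property.
        return getattr(self, f"{address_type}_address")

config = IPCCommunicationConfig()
print(config.get_address("credit_drop"))  # ipc:///tmp/aiperf/credit_drop
```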
BaseZMQProxyConfig
¶
Bases: BaseModel, ABC
Configuration Protocol for ZMQ Proxy.
Source code in aiperf/common/config/zmq_config.py
backend_address
abstractmethod
property
¶
Get the backend address based on protocol configuration.
capture_address
abstractmethod
property
¶
Get the capture address based on protocol configuration.
control_address
abstractmethod
property
¶
Get the control address based on protocol configuration.
frontend_address
abstractmethod
property
¶
Get the frontend address based on protocol configuration.
ZMQIPCConfig
¶
Bases: BaseZMQCommunicationConfig
Configuration for IPC transport.
Source code in aiperf/common/config/zmq_config.py
ZMQIPCProxyConfig
¶
Bases: BaseZMQProxyConfig
Configuration for IPC proxy.
Source code in aiperf/common/config/zmq_config.py
backend_address
property
¶
Get the backend address based on protocol configuration.
capture_address
property
¶
Get the capture address based on protocol configuration.
control_address
property
¶
Get the control address based on protocol configuration.
frontend_address
property
¶
Get the frontend address based on protocol configuration.
ZMQTCPConfig
¶
Bases: BaseZMQCommunicationConfig
Configuration for TCP transport.
Source code in aiperf/common/config/zmq_config.py
ZMQTCPProxyConfig
¶
Bases: BaseZMQProxyConfig
Configuration for TCP proxy.
Source code in aiperf/common/config/zmq_config.py
backend_address
property
¶
Get the backend address based on protocol configuration.
capture_address
property
¶
Get the capture address based on protocol configuration.
control_address
property
¶
Get the control address based on protocol configuration.
frontend_address
property
¶
Get the frontend address based on protocol configuration.
aiperf.common.constants¶
DEFAULT_COMMAND_RESPONSE_TIMEOUT = 30.0
module-attribute
¶
Default timeout for command responses in seconds.
DEFAULT_COMMS_REQUEST_TIMEOUT = 30.0
module-attribute
¶
Default timeout for requests from req_clients to rep_clients in seconds.
DEFAULT_CONNECTION_PROBE_INTERVAL = 0.1
module-attribute
¶
Default interval for connection probes in seconds until a response is received.
DEFAULT_CONNECTION_PROBE_TIMEOUT = 30.0
module-attribute
¶
Maximum amount of time to wait for connection probe response.
DEFAULT_MAX_REGISTRATION_ATTEMPTS = 10
module-attribute
¶
Default maximum number of registration attempts for component services before giving up.
DEFAULT_PROFILE_CANCEL_TIMEOUT = 10.0
module-attribute
¶
Default timeout for cancelling a profile run in seconds.
DEFAULT_PROFILE_CONFIGURE_TIMEOUT = 300.0
module-attribute
¶
Default timeout for profile configure command in seconds.
DEFAULT_PROFILE_START_TIMEOUT = 60.0
module-attribute
¶
Default timeout for profile start command in seconds.
DEFAULT_PULL_CLIENT_MAX_CONCURRENCY = 100000
module-attribute
¶
Default maximum concurrency for pull clients.
DEFAULT_REGISTRATION_INTERVAL = 1.0
module-attribute
¶
Default interval between registration attempts in seconds for component services.
DEFAULT_SERVICE_REGISTRATION_TIMEOUT = 30.0
module-attribute
¶
Default timeout for service registration in seconds.
DEFAULT_SERVICE_START_TIMEOUT = 30.0
module-attribute
¶
Default timeout for service start in seconds.
DEFAULT_SHUTDOWN_ACK_TIMEOUT = 5.0
module-attribute
¶
Default timeout for waiting for a shutdown command response in seconds.
DEFAULT_UI_MIN_UPDATE_PERCENT = 1.0
module-attribute
¶
Default minimum percentage difference from the last update to trigger a UI update (for non-dashboard UIs).
DEFAULT_WORKER_CHECK_INTERVAL = 1.0
module-attribute
¶
Default interval between worker checks in seconds for the WorkerManager.
DEFAULT_WORKER_ERROR_RECOVERY_TIME = 3.0
module-attribute
¶
Default time in seconds from the last time a worker had an error before it is considered healthy again.
DEFAULT_WORKER_HIGH_LOAD_CPU_USAGE = 75.0
module-attribute
¶
Default CPU usage threshold for a worker to be considered high load.
DEFAULT_WORKER_HIGH_LOAD_RECOVERY_TIME = 5.0
module-attribute
¶
Default time in seconds from the last time a worker was in high load before it is considered healthy again.
DEFAULT_WORKER_STALE_TIME = 10.0
module-attribute
¶
Default time in seconds from the last time a worker reported any status before it is considered stale.
DEFAULT_WORKER_STATUS_SUMMARY_INTERVAL = 0.5
module-attribute
¶
Default interval in seconds between worker status summary messages.
GRACEFUL_SHUTDOWN_TIMEOUT_SECONDS = 5.0
module-attribute
¶
Default timeout for shutting down services in seconds.
TASK_CANCEL_TIMEOUT_LONG = 5.0
module-attribute
¶
Maximum time to wait for complex tasks to complete when cancelling them (like parent tasks).
TASK_CANCEL_TIMEOUT_SHORT = 2.0
module-attribute
¶
Maximum time to wait for simple tasks to complete when cancelling them.
aiperf.common.decorators¶
Decorators for AIPerf components. Note that these are not the same as hooks. Hooks are used to specify that a function should be called at a specific time, while decorators are used to specify that a class or function should be treated in a specific way.
See also: :mod:aiperf.common.hooks for hook decorators.
DecoratorAttrs
¶
Constant attribute names for decorators.
When you decorate a class with a decorator, the decorator type and parameters are set as attributes on the class.
Source code in aiperf/common/decorators.py
implements_protocol(protocol)
¶
Decorator to specify that the class implements the given protocol.
Example:
@implements_protocol(ServiceProtocol)
class BaseService:
    pass
The above is equivalent to setting:
BaseService.__implements_protocol__ = ServiceProtocol
Source code in aiperf/common/decorators.py
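The snippet above can be made runnable; the ServiceProtocol stand-in class is illustrative, while the __implements_protocol__ attribute name comes from the documentation:

```python
# Minimal runnable sketch of the decorator described above; the real
# implementation may perform additional validation.
def implements_protocol(protocol):
    def decorator(cls):
        # Record the protocol on the class, as documented.
        cls.__implements_protocol__ = protocol
        return cls
    return decorator

class ServiceProtocol:
    """Stand-in protocol class for the example."""

@implements_protocol(ServiceProtocol)
class BaseService:
    pass

assert BaseService.__implements_protocol__ is ServiceProtocol
```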
aiperf.common.enums.base_enums¶
BasePydanticBackedStrEnum
¶
Bases: CaseInsensitiveStrEnum
Custom enumeration class that extends CaseInsensitiveStrEnum and is backed by a BasePydanticEnumInfo containing the tag and any other information needed to represent the enum member.
Source code in aiperf/common/enums/base_enums.py
info
cached
property
¶
Get the enum info for the enum member.
BasePydanticEnumInfo
¶
Bases: BaseModel
Base class for the info classes backing enums that extend BasePydanticBackedStrEnum. By default, it provides a tag for the enum member, which is used for lookup and string comparison; subclasses can provide additional information as needed.
Source code in aiperf/common/enums/base_enums.py
CaseInsensitiveStrEnum
¶
Bases: str, Enum
CaseInsensitiveStrEnum is a custom enumeration class that extends str and Enum to provide case-insensitive
lookup functionality for its members.
Source code in aiperf/common/enums/base_enums.py
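The case-insensitive lookup can be sketched with the standard Enum._missing_ hook; this is an illustrative reimplementation, not the actual aiperf source (the member values reuse the documented TimingMode tags):

```python
from enum import Enum

class CaseInsensitiveStrEnum(str, Enum):
    @classmethod
    def _missing_(cls, value):
        # Called when the exact value is not found; retry case-insensitively.
        if isinstance(value, str):
            lowered = value.lower()
            for member in cls:
                if member.value.lower() == lowered:
                    return member
        return None

class TimingMode(CaseInsensitiveStrEnum):
    CONCURRENCY = "concurrency"
    REQUEST_RATE = "request_rate"

# Lookup succeeds regardless of case.
assert TimingMode("CONCURRENCY") is TimingMode.CONCURRENCY
assert TimingMode("Request_Rate") is TimingMode.REQUEST_RATE
```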
aiperf.common.enums.command_enums¶
aiperf.common.enums.communication_enums¶
CommAddress
¶
Bases: CaseInsensitiveStrEnum
Enum for specifying the address type for communication clients. This is used to lookup the address in the communication config.
Source code in aiperf/common/enums/communication_enums.py
CREDIT_DROP = 'credit_drop'
class-attribute
instance-attribute
¶
Address to send CreditDrop messages from the TimingManager to the Worker.
CREDIT_RETURN = 'credit_return'
class-attribute
instance-attribute
¶
Address to send CreditReturn messages from the Worker to the TimingManager.
DATASET_MANAGER_PROXY_BACKEND = 'dataset_manager_proxy_backend'
class-attribute
instance-attribute
¶
Backend address for the DatasetManager to receive requests from clients.
DATASET_MANAGER_PROXY_FRONTEND = 'dataset_manager_proxy_frontend'
class-attribute
instance-attribute
¶
Frontend address for sending requests to the DatasetManager.
EVENT_BUS_PROXY_BACKEND = 'event_bus_proxy_backend'
class-attribute
instance-attribute
¶
Backend address for services to subscribe to messages.
EVENT_BUS_PROXY_FRONTEND = 'event_bus_proxy_frontend'
class-attribute
instance-attribute
¶
Frontend address for services to publish messages to.
RAW_INFERENCE_PROXY_BACKEND = 'raw_inference_proxy_backend'
class-attribute
instance-attribute
¶
Backend address for the InferenceParser to receive raw inference messages from Workers.
RAW_INFERENCE_PROXY_FRONTEND = 'raw_inference_proxy_frontend'
class-attribute
instance-attribute
¶
Frontend address for sending raw inference messages to the InferenceParser from Workers.
RECORDS = 'records'
class-attribute
instance-attribute
¶
Address to send parsed records from InferenceParser to RecordManager.
aiperf.common.enums.data_exporter_enums¶
aiperf.common.enums.dataset_enums¶
aiperf.common.enums.endpoints_enums¶
EndpointType
¶
Bases: BasePydanticBackedStrEnum
Endpoint types supported by AIPerf.
These are the full definitions of the endpoints that are supported by AIPerf.
Each enum value contains additional metadata about the endpoint, such as whether it supports streaming,
produces tokens, and the default endpoint path. This is stored as an attribute on the enum value, and can be accessed
via the info property. The enum values can still be used as strings for user input and comparison (via the tag field).
Source code in aiperf/common/enums/endpoints_enums.py
endpoint_path
property
¶
Get the default endpoint path for the endpoint type. If None, the endpoint does not have a specific path.
info
cached
property
¶
Get the endpoint info for the endpoint type.
metrics_title
property
¶
Get the metrics table title string for the endpoint type. If None, the default title is used.
produces_tokens
property
¶
Return True if the endpoint produces tokens. This is used to determine what metrics are applicable to the endpoint.
supports_audio
property
¶
Return True if the endpoint supports audio input. This is used to determine what metrics are applicable to the endpoint, as well as what inputs can be used.
supports_images
property
¶
Return True if the endpoint supports image input. This is used to determine what metrics are applicable to the endpoint, as well as what inputs can be used.
supports_streaming
property
¶
Return True if the endpoint supports streaming. This is used for validation of user input.
EndpointTypeInfo
¶
Bases: BasePydanticEnumInfo
Pydantic model for endpoint-specific metadata. This model is used to store additional info on each EndpointType enum value.
For documentation on the fields, see the :class:EndpointType @property functions.
Source code in aiperf/common/enums/endpoints_enums.py
aiperf.common.enums.logging_enums¶
aiperf.common.enums.media_enums¶
MediaType
¶
Bases: CaseInsensitiveStrEnum
The various types of media (e.g. text, image, audio).
Source code in aiperf/common/enums/media_enums.py
aiperf.common.enums.message_enums¶
MessageType
¶
Bases: CaseInsensitiveStrEnum
The various types of messages that can be sent between services.
The message type is used to determine what Pydantic model the message maps to,
based on the message_type field in the message model. For detailed explanations
of each message type, go to its definition in :mod:aiperf.common.messages.
Source code in aiperf/common/enums/message_enums.py
aiperf.common.enums.metric_enums¶
BaseMetricUnit
¶
Bases: BasePydanticBackedStrEnum
Base class for all metric units.
Source code in aiperf/common/enums/metric_enums.py
info
cached
property
¶
Get the info for the metric unit.
convert_to(other_unit, value)
¶
Convert a value from this unit to another unit. This is a passthrough to the info class.
Source code in aiperf/common/enums/metric_enums.py
BaseMetricUnitInfo
¶
Bases: BasePydanticEnumInfo
Base class for all metric units. Provides a base implementation for converting between units which can be overridden by subclasses to support more complex conversions.
Source code in aiperf/common/enums/metric_enums.py
convert_to(other_unit, value)
¶
Convert a value from this unit to another unit.
Source code in aiperf/common/enums/metric_enums.py
GenericMetricUnit
¶
Bases: BaseMetricUnit
Defines generic units for metrics. These don't have any extra information other than the tag, which is used for display purposes.
Source code in aiperf/common/enums/metric_enums.py
MetricDateTimeUnit
¶
Bases: BaseMetricUnit
Defines the various date time units that can be used for metrics.
Source code in aiperf/common/enums/metric_enums.py
MetricFlags
¶
Bases: Flag
Defines the possible flags for metrics that are used to determine how they are processed or grouped. These flags are intended to be an easy way to group metrics, or turn on/off certain features.
Note that the flags are a bitmask, so they can be combined using the bitwise OR operator (|).
For example, to create a flag that is both STREAMING_ONLY and HIDDEN, you can do:
MetricFlags.STREAMING_ONLY | MetricFlags.HIDDEN
To check if a metric has a flag, you can use the has_flags method.
For example, to check if a metric has both the STREAMING_ONLY and HIDDEN flags, you can do:
metric.has_flags(MetricFlags.STREAMING_ONLY | MetricFlags.HIDDEN)
To check if a metric does not have a flag(s), you can use the missing_flags method.
For example, to check if a metric does not have either the STREAMING_ONLY or HIDDEN flags, you can do:
metric.missing_flags(MetricFlags.STREAMING_ONLY | MetricFlags.HIDDEN)
Source code in aiperf/common/enums/metric_enums.py
ERROR_ONLY = 1 << 1
class-attribute
instance-attribute
¶
Metrics that are only applicable to error records. By default, metrics are only computed if the record is valid. If this flag is set, the metric will only be computed if the record is invalid.
HIDDEN = 1 << 3
class-attribute
instance-attribute
¶
Metrics that should not be displayed in the UI.
INTERNAL = 1 << 5 | HIDDEN
class-attribute
instance-attribute
¶
Metrics that are internal to the system and not applicable to the user. This inherently means that the metric is HIDDEN as well.
LARGER_IS_BETTER = 1 << 4
class-attribute
instance-attribute
¶
Metrics that are better when the value is larger. By default, it is assumed that metrics are better when the value is smaller.
NONE = 0
class-attribute
instance-attribute
¶
No flags.
PRODUCES_TOKENS_ONLY = 1 << 2
class-attribute
instance-attribute
¶
Metrics that are only applicable when profiling an endpoint that produces tokens.
STREAMING_ONLY = 1 << 0
class-attribute
instance-attribute
¶
Metrics that are only applicable to streamed responses.
STREAMING_TOKENS_ONLY = STREAMING_ONLY | PRODUCES_TOKENS_ONLY
class-attribute
instance-attribute
¶
Metrics that are only applicable to streamed responses and token-based endpoints.
This is a convenience flag that is the combination of the STREAMING_ONLY and PRODUCES_TOKENS_ONLY flags.
SUPPORTS_AUDIO_ONLY = 1 << 6
class-attribute
instance-attribute
¶
Metrics that are only applicable to audio-based endpoints.
SUPPORTS_IMAGE_ONLY = 1 << 7
class-attribute
instance-attribute
¶
Metrics that are only applicable to image-based endpoints.
has_flags(flags)
¶
Return True if the metric has ALL of the given flag(s) (regardless of other flags).
Source code in aiperf/common/enums/metric_enums.py
missing_flags(flags)
¶
Return True if the metric does not have ANY of the given flag(s) (regardless of other flags). It will return False if the metric has ANY of the given flags. If the input flags are NONE, it will return True.
Source code in aiperf/common/enums/metric_enums.py
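The flag semantics described above can be exercised with Python's enum.Flag; the bit positions mirror the documented values, while the method bodies are an assumed sketch of the documented behavior:

```python
from enum import Flag

class MetricFlags(Flag):
    NONE = 0
    STREAMING_ONLY = 1 << 0
    ERROR_ONLY = 1 << 1
    PRODUCES_TOKENS_ONLY = 1 << 2
    HIDDEN = 1 << 3
    STREAMING_TOKENS_ONLY = STREAMING_ONLY | PRODUCES_TOKENS_ONLY

    def has_flags(self, flags):
        # True only if ALL of the given flags are present.
        return (self & flags) == flags

    def missing_flags(self, flags):
        # True only if NONE of the given flags are present.
        return (self & flags) == MetricFlags.NONE

flags = MetricFlags.STREAMING_ONLY | MetricFlags.HIDDEN
assert flags.has_flags(MetricFlags.STREAMING_ONLY | MetricFlags.HIDDEN)
assert flags.missing_flags(MetricFlags.ERROR_ONLY)
assert not flags.missing_flags(MetricFlags.HIDDEN | MetricFlags.ERROR_ONLY)
```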
MetricOverTimeUnit
¶
Bases: BaseMetricUnit
Defines the units for metrics that are a generic unit over a specific time unit.
Source code in aiperf/common/enums/metric_enums.py
MetricOverTimeUnitInfo
¶
Bases: BaseMetricUnitInfo
Information about a metric over time unit.
Source code in aiperf/common/enums/metric_enums.py
convert_to(other_unit, value)
¶
Convert a value from this unit to another unit.
Source code in aiperf/common/enums/metric_enums.py
MetricSizeUnit
¶
Bases: BaseMetricUnit
Defines the size types for metrics.
Source code in aiperf/common/enums/metric_enums.py
MetricSizeUnitInfo
¶
Bases: BaseMetricUnitInfo
Information about a size unit for metrics.
Source code in aiperf/common/enums/metric_enums.py
convert_to(other_unit, value)
¶
Convert a value from this unit to another unit.
Source code in aiperf/common/enums/metric_enums.py
MetricTimeUnit
¶
Bases: BaseMetricUnit
Defines the various time units that can be used for metrics, as well as the conversion factor to convert to other units.
Source code in aiperf/common/enums/metric_enums.py
info
cached
property
¶
Get the info for the metric time unit.
long_name
cached
property
¶
The long name of the metric time unit.
per_second
cached
property
¶
How many of these units there are in one second. Used as a common conversion factor to convert to other units.
convert_to(other_unit, value)
¶
Convert a value from this unit to another unit.
Source code in aiperf/common/enums/metric_enums.py
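Since per_second expresses how many of a unit fit in one second, converting between any two time units reduces to a ratio of factors. A standalone sketch of that arithmetic; the factor table is illustrative, not the enum itself:

```python
# How many of each unit there are in one second (illustrative table).
PER_SECOND = {
    "seconds": 1,
    "milliseconds": 1_000,
    "microseconds": 1_000_000,
    "nanoseconds": 1_000_000_000,
}

def convert_to(value, from_unit, to_unit):
    # value[from_unit] * (to_factor / from_factor) = value[to_unit]
    return value * PER_SECOND[to_unit] / PER_SECOND[from_unit]

assert convert_to(1.5, "seconds", "milliseconds") == 1500.0
assert convert_to(2_000_000, "nanoseconds", "milliseconds") == 2.0
```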
MetricTimeUnitInfo
¶
Bases: BaseMetricUnitInfo
Information about a time unit for metrics.
Source code in aiperf/common/enums/metric_enums.py
MetricType
¶
Bases: CaseInsensitiveStrEnum
Defines the possible types of metrics.
Source code in aiperf/common/enums/metric_enums.py
AGGREGATE = 'aggregate'
class-attribute
instance-attribute
¶
Metrics that keep track of one or more values over time, that are updated for each request, such as total counts, min/max values, etc. These metrics may or may not change each request, and are affected by other requests. Examples: min/max request latency, total request count, benchmark duration, etc.
DERIVED = 'derived'
class-attribute
instance-attribute
¶
Metrics that are purely derived from other metrics as a summary, and do not require per-request values. Examples: request throughput, output token throughput, etc.
RECORD = 'record'
class-attribute
instance-attribute
¶
Metrics that provide a distinct value for each request. Every request that comes in will produce a new value that is not affected by any other requests. These metrics can be tracked over time and compared to each other. Examples: request latency, ISL, ITL, OSL, etc.
MetricValueType
¶
Bases: BasePydanticBackedStrEnum
Defines the possible types of values for metrics.
NOTE: The string representation is important here, as it is used to automatically determine the type based on the python generic type definition.
Source code in aiperf/common/enums/metric_enums.py
converter
cached
property
¶
Get the converter for the metric value type.
default_factory
cached
property
¶
Get the default value generator for the metric value type.
dtype
cached
property
¶
Get the dtype for the metric value type (for pandas/numpy).
info
cached
property
¶
Get the info for the metric value type.
from_python_type(type)
classmethod
¶
Get the MetricValueType for a given type.
Source code in aiperf/common/enums/metric_enums.py
MetricValueTypeInfo
¶
Bases: BasePydanticEnumInfo
Information about a metric value type.
Source code in aiperf/common/enums/metric_enums.py
aiperf.common.enums.model_enums¶
ModelSelectionStrategy
¶
Bases: CaseInsensitiveStrEnum
Strategy for selecting the model to use for the request.
Source code in aiperf/common/enums/model_enums.py
aiperf.common.enums.post_processor_enums¶
RecordProcessorType
¶
Bases: CaseInsensitiveStrEnum
Type of streaming record processor.
Source code in aiperf/common/enums/post_processor_enums.py
METRIC_RECORD = 'metric_record'
class-attribute
instance-attribute
¶
Streamer that streams records and computes metrics from MetricType.RECORD and MetricType.AGGREGATE. This is the first stage of the metrics processing pipeline, and is done in a distributed manner across multiple service instances.
ResultsProcessorType
¶
Bases: CaseInsensitiveStrEnum
Type of streaming results processor.
Source code in aiperf/common/enums/post_processor_enums.py
METRIC_RESULTS = 'metric_results'
class-attribute
instance-attribute
¶
Processor that processes the metric results from METRIC_RECORD and computes metrics from MetricType.DERIVED, as well as aggregating the results. This is the last stage of the metrics processing pipeline, and is done from the RecordsManager after all the service instances have completed their processing.
aiperf.common.enums.service_enums¶
LifecycleState
¶
Bases: CaseInsensitiveStrEnum
These are the various states a lifecycle can be in.
Source code in aiperf/common/enums/service_enums.py
ServiceRegistrationStatus
¶
Bases: CaseInsensitiveStrEnum
Defines the various states a service can be in during registration with the SystemController.
Source code in aiperf/common/enums/service_enums.py
ERROR = 'error'
class-attribute
instance-attribute
¶
The service registration failed.
REGISTERED = 'registered'
class-attribute
instance-attribute
¶
The service is registered with the SystemController.
TIMEOUT = 'timeout'
class-attribute
instance-attribute
¶
The service registration timed out.
UNREGISTERED = 'unregistered'
class-attribute
instance-attribute
¶
The service is not registered with the SystemController. This is the initial state.
WAITING = 'waiting'
class-attribute
instance-attribute
¶
The service is waiting for the SystemController to register it. This is a temporary state that should be followed by REGISTERED, TIMEOUT, or ERROR.
ServiceRunType
¶
Bases: CaseInsensitiveStrEnum
The different ways the SystemController should run the component services.
Source code in aiperf/common/enums/service_enums.py
KUBERNETES = 'k8s'
class-attribute
instance-attribute
¶
Run each service as a separate Kubernetes pod. This is the default way for multi-node deployments.
MULTIPROCESSING = 'process'
class-attribute
instance-attribute
¶
Run each service as a separate process. This is the default way for single-node deployments.
ServiceType
¶
Bases: CaseInsensitiveStrEnum
Types of services in the AIPerf system.
This is used to identify the service type when registering with the SystemController. It can also be used for tracking purposes if multiple instances of the same service type are running.
Source code in aiperf/common/enums/service_enums.py
aiperf.common.enums.sse_enums¶
SSEFieldType
¶
Bases: CaseInsensitiveStrEnum
Field types in an SSE message.
Source code in aiperf/common/enums/sse_enums.py
aiperf.common.enums.system_enums¶
SystemState
¶
Bases: CaseInsensitiveStrEnum
State of the system as a whole.
This is used to track the state of the system as a whole, and is used to determine what actions to take when a signal is received.
Source code in aiperf/common/enums/system_enums.py
CONFIGURING = 'configuring'
class-attribute
instance-attribute
¶
The system is configuring services.
INITIALIZING = 'initializing'
class-attribute
instance-attribute
¶
The system is initializing. This is the initial state.
PROCESSING = 'processing'
class-attribute
instance-attribute
¶
The system is processing results.
PROFILING = 'profiling'
class-attribute
instance-attribute
¶
The system is running a profiling run.
READY = 'ready'
class-attribute
instance-attribute
¶
The system is ready to start profiling. This is a temporary state that should be followed by PROFILING.
SHUTDOWN = 'shutdown'
class-attribute
instance-attribute
¶
The system is shutting down. This is the final state.
STOPPING = 'stopping'
class-attribute
instance-attribute
¶
The system is stopping.
aiperf.common.enums.timing_enums¶
CreditPhase
¶
Bases: CaseInsensitiveStrEnum
The type of credit phase. This is used to identify which phase of the benchmark the credit is being used in, for tracking and reporting purposes.
Source code in aiperf/common/enums/timing_enums.py
PROFILING = 'profiling'
class-attribute
instance-attribute
¶
The credit phase while profiling is active. This is the primary phase of the benchmark, and what is used to calculate the final results.
WARMUP = 'warmup'
class-attribute
instance-attribute
¶
The credit phase while the warmup is active. This is used to warm up the model and ensure that the model is ready to be profiled.
RequestRateMode
¶
Bases: CaseInsensitiveStrEnum
The different ways the RequestRateStrategy should generate requests.
Source code in aiperf/common/enums/timing_enums.py
TimingMode
¶
Bases: CaseInsensitiveStrEnum
The different ways the TimingManager should generate requests.
Source code in aiperf/common/enums/timing_enums.py
CONCURRENCY = 'concurrency'
class-attribute
instance-attribute
¶
A mode where the TimingManager will maintain a continuous stream of concurrent requests.
FIXED_SCHEDULE = 'fixed_schedule'
class-attribute
instance-attribute
¶
A mode where the TimingManager will send requests according to a fixed schedule.
REQUEST_RATE = 'request_rate'
class-attribute
instance-attribute
¶
A mode where the TimingManager will send requests at either a constant request rate or based on a poisson distribution.
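The constant versus Poisson distinction in REQUEST_RATE comes down to how inter-request delays are drawn. A hypothetical helper (not part of aiperf) that illustrates the two schedules:

```python
import random


def inter_request_delays(rate_per_sec: float, n: int, poisson: bool, seed: int = 0):
    """Yield n delays (seconds) between requests for a target request rate.

    Constant mode spaces requests evenly; Poisson mode draws delays from an
    exponential distribution with the same mean, so the long-run rate matches
    but individual gaps vary (burstier, more realistic traffic).
    """
    rng = random.Random(seed)
    mean = 1.0 / rate_per_sec
    for _ in range(n):
        yield rng.expovariate(rate_per_sec) if poisson else mean
```

At 10 requests/second, constant mode yields a fixed 0.1 s gap, while Poisson mode yields gaps averaging 0.1 s over many requests.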
aiperf.common.enums.ui_enums¶
AIPerfUIType
¶
Bases: CaseInsensitiveStrEnum
The type of UI to use.
Source code in aiperf/common/enums/ui_enums.py
aiperf.common.enums.worker_enums¶
WorkerStatus
¶
Bases: CaseInsensitiveStrEnum
The current status of a worker service.
NOTE: The order of the statuses is important for the UI.
Source code in aiperf/common/enums/worker_enums.py
aiperf.common.exceptions¶
AIPerfError
¶
Bases: Exception
Base class for all exceptions raised by AIPerf.
Source code in aiperf/common/exceptions.py
__str__()
¶
Return the string representation of the exception with the class name.
Source code in aiperf/common/exceptions.py
raw_str()
¶
Return the raw string representation of the exception.
Source code in aiperf/common/exceptions.py
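The raw_str/__str__ split above can be sketched with a minimal base exception (illustrative; the real class lives in aiperf/common/exceptions.py and may differ in detail):

```python
class AIPerfError(Exception):
    """Sketch: prefix the string form with the concrete class name."""

    def raw_str(self) -> str:
        # The plain exception message, without any class-name prefix.
        return super().__str__()

    def __str__(self) -> str:
        return f"{self.__class__.__name__}: {self.raw_str()}"


class NotFoundError(AIPerfError):
    """Raised when something is not found or not available."""
```

This makes log lines self-describing: catching a generic AIPerfError still prints the concrete subclass name.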
AIPerfMultiError
¶
Bases: AIPerfError
Exception raised when running multiple tasks and one or more fail.
Source code in aiperf/common/exceptions.py
CommunicationError
¶
Bases: AIPerfError
Generic communication error.
Source code in aiperf/common/exceptions.py
ConfigurationError
¶
Bases: AIPerfError
Exception raised when something fails to configure, or there is a configuration error.
Source code in aiperf/common/exceptions.py
DatasetError
¶
Bases: AIPerfError
Generic dataset error.
Source code in aiperf/common/exceptions.py
DatasetGeneratorError
¶
Bases: AIPerfError
Generic dataset generator error.
Source code in aiperf/common/exceptions.py
FactoryCreationError
¶
Bases: AIPerfError
Exception raised when a factory encounters an error while creating a class.
Source code in aiperf/common/exceptions.py
InferenceClientError
¶
Bases: AIPerfError
Exception raised when an inference client encounters an error.
Source code in aiperf/common/exceptions.py
InitializationError
¶
Bases: AIPerfError
Exception raised when something fails to initialize.
Source code in aiperf/common/exceptions.py
InvalidOperationError
¶
Bases: AIPerfError
Exception raised when an operation is invalid.
Source code in aiperf/common/exceptions.py
InvalidPayloadError
¶
Bases: InferenceClientError
Exception raised when an inference client receives an invalid payload.
Source code in aiperf/common/exceptions.py
InvalidStateError
¶
Bases: AIPerfError
Exception raised when something is in an invalid state.
Source code in aiperf/common/exceptions.py
MetricTypeError
¶
Bases: AIPerfError
Exception raised when a metric type encounters an error while creating a class.
Source code in aiperf/common/exceptions.py
MetricUnitError
¶
Bases: AIPerfError
Exception raised when trying to convert a metric to or from a unit that it does not support.
Source code in aiperf/common/exceptions.py
NotFoundError
¶
Bases: AIPerfError
Exception raised when something is not found or not available.
Source code in aiperf/common/exceptions.py
NotInitializedError
¶
Bases: AIPerfError
Exception raised when something that should be initialized is not.
Source code in aiperf/common/exceptions.py
ProxyError
¶
Bases: AIPerfError
Exception raised when a proxy encounters an error.
Source code in aiperf/common/exceptions.py
ServiceError
¶
Bases: AIPerfError
Generic service error.
Source code in aiperf/common/exceptions.py
ShutdownError
¶
Bases: AIPerfError
Exception raised when a service encounters an error while shutting down.
Source code in aiperf/common/exceptions.py
UnsupportedHookError
¶
Bases: AIPerfError
Exception raised when a hook is defined on a class that does not have any base classes that provide that hook type.
Source code in aiperf/common/exceptions.py
ValidationError
¶
Bases: AIPerfError
Exception raised when something fails validation.
Source code in aiperf/common/exceptions.py
aiperf.common.factories¶
AIPerfFactory
¶
Bases: Generic[ClassEnumT, ClassProtocolT]
Defines a custom factory for AIPerf components.
This class is used to create a factory for a given class type and protocol.
Example:

```python
# Define a new enum for the expected implementation types.
# This is optional, but recommended for type safety.
class DatasetLoaderType(CaseInsensitiveStrEnum):
    FILE = "file"
    S3 = "s3"


# Define a new class protocol.
class DatasetLoaderProtocol(Protocol):
    def load(self) -> Dataset: ...


# Create a new factory for the class type and protocol.
class DatasetFactory(AIPerfFactory[DatasetLoaderType, DatasetLoaderProtocol]):
    pass


# Register a class type mapping to its corresponding class.
# The registered class should implement the class protocol.
@DatasetFactory.register(DatasetLoaderType.FILE)
class FileDatasetLoader:
    def __init__(self, filename: str):
        self.filename = filename

    def load(self) -> Dataset:
        return Dataset.from_file(self.filename)


dataset_config = {"type": DatasetLoaderType.FILE, "filename": "data.csv"}

# The factory dispatches on the type, so no manual branching is needed.
dataset_instance = DatasetFactory.create_instance(
    dataset_config["type"], filename=dataset_config["filename"]
)
dataset_instance.load()
```
Source code in aiperf/common/factories.py
create_instance(class_type, **kwargs)
classmethod
¶
Create a new class instance.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| class_type | ClassEnumT \| str | The type of class to create | required |
| **kwargs | Any | Additional arguments for the class | {} |

Returns:

| Type | Description |
|---|---|
| ClassProtocolT | The created class instance |

Raises:

| Type | Description |
|---|---|
| FactoryCreationError | If the class type is not registered or there is an error creating the instance |
Source code in aiperf/common/factories.py
get_all_class_types()
classmethod
¶
Get all registered class types.
Source code in aiperf/common/factories.py
get_all_classes()
classmethod
¶
Get all registered classes.
Returns:

| Type | Description |
|---|---|
| list[type[ClassProtocolT]] | A list of all registered classes implementing the expected protocol |
Source code in aiperf/common/factories.py
get_all_classes_and_types()
classmethod
¶
Get all registered classes and their corresponding class types.
Source code in aiperf/common/factories.py
get_class_from_type(class_type)
classmethod
¶
Get the class from a class type.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| class_type | ClassEnumT \| str | The class type to get the class from | required |

Returns:

| Type | Description |
|---|---|
| type[ClassProtocolT] | The class for the given class type |

Raises:

| Type | Description |
|---|---|
| TypeError | If the class type is not registered |
Source code in aiperf/common/factories.py
register(class_type, override_priority=0)
classmethod
¶
Register a new class type mapping to its corresponding class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| class_type | ClassEnumT \| str | The type of class to register | required |
| override_priority | int | The priority of the override. The higher the priority, the more precedence the override has when multiple classes are registered for the same class type. Built-in classes have a priority of 0. | 0 |

Returns:

| Type | Description |
|---|---|
| Callable | Decorator for the class that implements the class protocol |
Source code in aiperf/common/factories.py
register_all(*class_types, override_priority=0)
classmethod
¶
Register multiple class types mapping to a single corresponding class. This is useful if a single class implements multiple types. Currently, a single override priority applies to all registered types.
Source code in aiperf/common/factories.py
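The register/create_instance flow above can be condensed into a stand-alone sketch (illustrative only; the real AIPerfFactory adds enum-typed keys, override priorities, and richer error reporting):

```python
class MiniFactory:
    """Sketch of the register/create_instance pattern described above."""

    _registry: dict = {}

    @classmethod
    def register(cls, class_type: str):
        def decorator(klass):
            # Case-insensitive keys, mirroring CaseInsensitiveStrEnum lookups.
            cls._registry[class_type.lower()] = klass
            return klass
        return decorator

    @classmethod
    def create_instance(cls, class_type: str, **kwargs):
        try:
            klass = cls._registry[class_type.lower()]
        except KeyError:
            raise ValueError(f"Unregistered class type: {class_type}") from None
        return klass(**kwargs)


@MiniFactory.register("file")
class FileLoader:
    def __init__(self, filename: str):
        self.filename = filename
```

The decorator form keeps registration next to the class definition, so adding a new implementation never touches the factory itself.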
AIPerfSingletonFactory
¶
Bases: AIPerfFactory[ClassEnumT, ClassProtocolT]
Factory for registering and creating singleton instances of a given class type and protocol.
This factory is useful for creating instances that are shared across the application, such as communication clients.
Calling create_instance will create a new instance if it doesn't exist, otherwise it will return the existing instance.
Calling get_instance will return the existing instance if it exists, otherwise it will raise an error.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
create_instance(class_type, **kwargs)
classmethod
¶
Create a new instance of the given class type. If the instance does not exist, or the process ID has changed, a new instance will be created.
Source code in aiperf/common/factories.py
get_or_create_instance(class_type, **kwargs)
classmethod
¶
Syntactic sugar for create_instance, but with a more descriptive name for singleton factories.
Source code in aiperf/common/factories.py
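The pid-aware caching described for create_instance can be sketched as follows (illustrative; the class and helper names other than register/create_instance are hypothetical):

```python
import os


class MiniSingletonFactory:
    """Sketch: cache one instance per class type, invalidated across forks."""

    _registry: dict = {}
    _instances: dict = {}  # class_type -> (pid, instance)

    @classmethod
    def register(cls, class_type: str):
        def decorator(klass):
            cls._registry[class_type] = klass
            return klass
        return decorator

    @classmethod
    def create_instance(cls, class_type: str, **kwargs):
        pid = os.getpid()
        cached = cls._instances.get(class_type)
        if cached is not None and cached[0] == pid:
            return cached[1]
        # No instance yet, or we are running in a forked child: build afresh.
        instance = cls._registry[class_type](**kwargs)
        cls._instances[class_type] = (pid, instance)
        return instance


@MiniSingletonFactory.register("zmq")
class FakeCommClient:
    pass
```

Keying the cache on the process ID matters for communication clients: a socket created in the parent process must not be silently reused in a forked worker.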
AIPerfUIFactory
¶
Bases: AIPerfSingletonFactory[AIPerfUIType, 'AIPerfUIProtocol']
Factory for registering and creating AIPerfUIProtocol instances based on the specified AIPerfUI type.
see: :class:aiperf.common.factories.AIPerfSingletonFactory for more details.
Source code in aiperf/common/factories.py
CommunicationClientFactory
¶
Bases: AIPerfFactory[CommClientType, 'CommunicationClientProtocol']
Factory for registering and creating CommunicationClientProtocol instances based on the specified communication client type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
CommunicationFactory
¶
Bases: AIPerfSingletonFactory[CommunicationBackend, 'CommunicationProtocol']
Factory for registering and creating CommunicationProtocol instances based on the specified communication backend.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
ComposerFactory
¶
Bases: AIPerfFactory[ComposerType, 'BaseDatasetComposer']
Factory for registering and creating BaseDatasetComposer instances based on the specified composer type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
ConsoleExporterFactory
¶
Bases: AIPerfFactory[ConsoleExporterType, 'ConsoleExporterProtocol']
Factory for registering and creating ConsoleExporterProtocol instances based on the specified data exporter type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
CustomDatasetFactory
¶
Bases: AIPerfFactory[CustomDatasetType, 'CustomDatasetLoaderProtocol']
Factory for registering and creating CustomDatasetLoaderProtocol instances based on the specified custom dataset type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
DataExporterFactory
¶
Bases: AIPerfFactory[DataExporterType, 'DataExporterProtocol']
Factory for registering and creating DataExporterProtocol instances based on the specified data exporter type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
InferenceClientFactory
¶
Bases: AIPerfFactory[EndpointType, 'InferenceClientProtocol']
Factory for registering and creating InferenceClientProtocol instances based on the specified endpoint type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
RecordProcessorFactory
¶
Bases: AIPerfFactory[RecordProcessorType, 'RecordProcessorProtocol']
Factory for registering and creating RecordProcessorProtocol instances based on the specified record processor type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
RequestConverterFactory
¶
Bases: AIPerfSingletonFactory[EndpointType, 'RequestConverterProtocol']
Factory for registering and creating RequestConverterProtocol instances based on the specified request payload type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
ResponseExtractorFactory
¶
Bases: AIPerfFactory[EndpointType, 'ResponseExtractorProtocol']
Factory for registering and creating ResponseExtractorProtocol instances based on the specified response extractor type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
ResultsProcessorFactory
¶
Bases: AIPerfFactory[ResultsProcessorType, 'ResultsProcessorProtocol']
Factory for registering and creating ResultsProcessorProtocol instances based on the specified results processor type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
ServiceFactory
¶
Bases: AIPerfFactory[ServiceType, 'ServiceProtocol']
Factory for registering and creating ServiceProtocol instances based on the specified service type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
ServiceManagerFactory
¶
Bases: AIPerfFactory[ServiceRunType, 'ServiceManagerProtocol']
Factory for registering and creating ServiceManagerProtocol instances based on the specified service run type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
ZMQProxyFactory
¶
Bases: AIPerfFactory[ZMQProxyType, 'BaseZMQProxy']
Factory for registering and creating BaseZMQProxy instances based on the specified ZMQ proxy type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
aiperf.common.hooks¶
This module provides an extensive set of hook definitions for AIPerf. It is designed to be
used in conjunction with the :class:HooksMixin for classes to provide support for hooks.
It provides a simple interface for registering hooks.
Classes should inherit from the :class:HooksMixin, and specify the provided
hook types by decorating the class with the :func:provides_hooks decorator.
The hook functions are registered by decorating functions with the various hook
decorators such as :func:on_init, :func:on_start, :func:on_stop, etc.
More than one hook can be registered for a given hook type, and classes that inherit from classes with existing hooks will inherit the hooks from the base classes as well.
The hooks are run by calling the :meth:HooksMixin.run_hooks method or retrieved via the
:meth:HooksMixin.get_hooks method on the class.
HookType = AIPerfHook | str
module-attribute
¶
Type alias for valid hook types. This is a union of the AIPerfHook enum and any user-defined custom strings.
Hook
¶
Bases: BaseModel, Generic[HookParamsT]
A hook is a function that is decorated with a hook type and optional parameters. The HookParamsT is the type of the parameters. You can either have a static value, or a callable that returns the parameters.
Source code in aiperf/common/hooks.py
resolve_params(self_obj)
¶
Resolve the parameters for the hook. If the parameters are a callable, it will be called with the self_obj as the argument, otherwise the parameters are returned as is.
Source code in aiperf/common/hooks.py
HookAttrs
¶
Constant attribute names for hooks.
When you decorate a function with a hook decorator, the hook type and parameters are set as attributes on the function or class.
Source code in aiperf/common/hooks.py
background_task(interval=None, immediate=True, stop_on_error=False)
¶
Decorator to mark a method as a background task with automatic management.
Tasks are automatically started when the service starts and stopped when the service stops. The decorated method will be run periodically in the background when the service is running.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| interval | float \| Callable[[SelfT], float] \| None | Time between task executions in seconds. If None, the task will run once. Can be a callable that returns the interval, which will be called with self as the argument. | None |
| immediate | bool | If True, run the task immediately on start; otherwise wait for the interval first. | True |
| stop_on_error | bool | If True, stop the task on any exception; otherwise log and continue. | False |
|
Example:

```python
class MyPlugin(AIPerfLifecycleMixin):
    @background_task(interval=1.0)
    def _background_task(self) -> None:
        pass
```

The above is equivalent to setting:

```python
MyPlugin._background_task.__aiperf_hook_type__ = AIPerfHook.BACKGROUND_TASK
MyPlugin._background_task.__aiperf_hook_params__ = BackgroundTaskParams(
    interval=1.0, immediate=True, stop_on_error=False
)
```
Source code in aiperf/common/hooks.py
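The decorator-plus-runner pattern described above can be sketched in a self-contained form (illustrative; the marker strings and the run_background_task helper are hypothetical stand-ins for AIPerfHook.BACKGROUND_TASK and the lifecycle mixin's task loop):

```python
import asyncio


def background_task(interval=None, immediate=True, stop_on_error=False):
    """Sketch of the decorator: stash the scheduling params on the function."""
    def decorator(func):
        func.__aiperf_hook_type__ = "background_task"  # illustrative marker
        func.__aiperf_hook_params__ = {
            "interval": interval,
            "immediate": immediate,
            "stop_on_error": stop_on_error,
        }
        return func
    return decorator


async def run_background_task(obj, func, max_runs):
    """Sketch of what a lifecycle mixin might do with those params."""
    params = func.__aiperf_hook_params__
    runs = 0
    if not params["immediate"] and params["interval"] is not None:
        await asyncio.sleep(params["interval"])
    while runs < max_runs:
        try:
            await func(obj)
        except Exception:
            if params["stop_on_error"]:
                raise  # abort the task loop on the first error
        runs += 1
        if params["interval"] is None:
            break  # one-shot task
        await asyncio.sleep(params["interval"])
```

Storing the parameters as function attributes (rather than running anything at decoration time) is what lets the mixin discover and schedule tasks later, when the service actually starts.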
on_command(*command_types)
¶
Decorator to specify that the function is a hook that should be called when a CommandMessage with the given
command type(s) is received from the message bus.
See :func:aiperf.common.hooks._hook_decorator_for_message_types.
Example:

```python
class MyService(BaseComponentService):
    @on_command(CommandType.PROFILE_START)
    def _on_profile_start(self, message: ProfileStartCommand) -> CommandResponse:
        pass
```

The above is equivalent to setting:

```python
MyService._on_profile_start.__aiperf_hook_type__ = AIPerfHook.ON_COMMAND
MyService._on_profile_start.__aiperf_hook_params__ = (CommandType.PROFILE_START,)
```
Source code in aiperf/common/hooks.py
on_init(func)
¶
Decorator to specify that the function is a hook that should be called during initialization.
See :func:aiperf.common.hooks._hook_decorator.
Example:

```python
class MyPlugin(AIPerfLifecycleMixin):
    @on_init
    def _init_plugin(self) -> None:
        pass
```

The above is equivalent to setting:

```python
MyPlugin._init_plugin.__aiperf_hook_type__ = AIPerfHook.ON_INIT
```
Source code in aiperf/common/hooks.py
on_message(*message_types)
¶
Decorator to specify that the function is a hook that should be called when messages of the
given type(s) (or topics) are received from the message bus.
See :func:aiperf.common.hooks._hook_decorator_with_params.
Example:

```python
class MyService(MessageBusClientMixin):
    @on_message(MessageType.STATUS)
    def _on_status_message(self, message: StatusMessage) -> None:
        pass
```

The above is equivalent to setting:

```python
MyService._on_status_message.__aiperf_hook_type__ = AIPerfHook.ON_MESSAGE
MyService._on_status_message.__aiperf_hook_params__ = (MessageType.STATUS,)
```
Source code in aiperf/common/hooks.py
on_profiling_progress(func)
¶
Decorator to specify that the function is a hook that should be called when a profiling progress update is received.
See :func:aiperf.common.hooks._hook_decorator.
Example:

```python
class MyPlugin(ProgressTrackerMixin):
    @on_profiling_progress
    def _on_profiling_progress(self, profiling_stats: RequestsStats) -> None:
        pass
```

The above is equivalent to setting:

```python
MyPlugin._on_profiling_progress.__aiperf_hook_type__ = AIPerfHook.ON_PROFILING_PROGRESS
```
Source code in aiperf/common/hooks.py
on_pull_message(*message_types)
¶
Decorator to specify that the function is a hook that should be called when a pull client
receives a message of the given type(s).
See :func:aiperf.common.hooks._hook_decorator_for_message_types.
Example:

```python
class MyService(PullClientMixin, BaseComponentService):
    @on_pull_message(MessageType.CREDIT_DROP)
    def _on_credit_drop_pull(self, message: CreditDropMessage) -> None:
        pass
```

The above is equivalent to setting:

```python
MyService._on_credit_drop_pull.__aiperf_hook_type__ = AIPerfHook.ON_PULL_MESSAGE
MyService._on_credit_drop_pull.__aiperf_hook_params__ = (MessageType.CREDIT_DROP,)
```
Source code in aiperf/common/hooks.py
on_records_progress(func)
¶
Decorator to specify that the function is a hook that should be called when a records progress update is received.
See :func:aiperf.common.hooks._hook_decorator.
Example:

```python
class MyPlugin(ProgressTrackerMixin):
    @on_records_progress
    def _on_records_progress(self, progress: RecordsStats) -> None:
        pass
```

The above is equivalent to setting:

```python
MyPlugin._on_records_progress.__aiperf_hook_type__ = AIPerfHook.ON_RECORDS_PROGRESS
```
Source code in aiperf/common/hooks.py
on_request(*message_types)
¶
Decorator to specify that the function is a hook that should be called when requests of the
given type(s) are received from a ReplyClient.
See :func:aiperf.common.hooks._hook_decorator_for_message_types.
Example:

```python
class MyService(RequestClientMixin, BaseComponentService):
    @on_request(MessageType.CONVERSATION_REQUEST)
    async def _handle_conversation_request(
        self, message: ConversationRequestMessage
    ) -> ConversationResponseMessage:
        return ConversationResponseMessage(
            ...
        )
```

The above is equivalent to setting:

```python
MyService._handle_conversation_request.__aiperf_hook_type__ = AIPerfHook.ON_REQUEST
MyService._handle_conversation_request.__aiperf_hook_params__ = (MessageType.CONVERSATION_REQUEST,)
```
Source code in aiperf/common/hooks.py
on_start(func)
¶
Decorator to specify that the function is a hook that should be called during start.
See :func:aiperf.common.hooks._hook_decorator.
Example:

```python
class MyPlugin(AIPerfLifecycleMixin):
    @on_start
    def _start_plugin(self) -> None:
        pass
```

The above is equivalent to setting:

```python
MyPlugin._start_plugin.__aiperf_hook_type__ = AIPerfHook.ON_START
```
Source code in aiperf/common/hooks.py
on_state_change(func)
¶
Decorator to specify that the function is a hook that should be called during the service state change.
See :func:aiperf.common.hooks._hook_decorator.
Example:

```python
class MyPlugin(AIPerfLifecycleMixin):
    @on_state_change
    def _on_state_change(self, old_state: LifecycleState, new_state: LifecycleState) -> None:
        pass
```

The above is equivalent to setting:

```python
MyPlugin._on_state_change.__aiperf_hook_type__ = AIPerfHook.ON_STATE_CHANGE
```
Source code in aiperf/common/hooks.py
on_stop(func)
¶
Decorator to specify that the function is a hook that should be called during stop.
See :func:aiperf.common.hooks._hook_decorator.
Example:

```python
class MyPlugin(AIPerfLifecycleMixin):
    @on_stop
    def _stop_plugin(self) -> None:
        pass
```

The above is equivalent to setting:

```python
MyPlugin._stop_plugin.__aiperf_hook_type__ = AIPerfHook.ON_STOP
```
Source code in aiperf/common/hooks.py
on_warmup_progress(func)
¶
Decorator to specify that the function is a hook that should be called when a warmup progress update is received.
See :func:aiperf.common.hooks._hook_decorator.
Example:

```python
class MyPlugin(ProgressTrackerMixin):
    @on_warmup_progress
    def _on_warmup_progress(self, warmup_stats: RequestsStats) -> None:
        pass
```

The above is equivalent to setting:

```python
MyPlugin._on_warmup_progress.__aiperf_hook_type__ = AIPerfHook.ON_WARMUP_PROGRESS
```
Source code in aiperf/common/hooks.py
on_worker_status_summary(func)
¶
Decorator to specify that the function is a hook that should be called when a worker status summary is received
from the WorkerManager.
See :func:aiperf.common.hooks._hook_decorator.
Example:

```python
class MyPlugin(WorkerTrackerMixin):
    @on_worker_status_summary
    def _on_worker_status_summary(self, worker_statuses: dict[str, WorkerStatus]) -> None:
        pass
```

The above is equivalent to setting:

```python
MyPlugin._on_worker_status_summary.__aiperf_hook_type__ = AIPerfHook.ON_WORKER_STATUS_SUMMARY
```
Source code in aiperf/common/hooks.py
on_worker_update(func)
¶
Decorator to specify that the function is a hook that should be called when a worker update is received.
See :func:aiperf.common.hooks._hook_decorator.
Example:

```python
class MyPlugin(WorkerTrackerMixin):
    @on_worker_update
    def _on_worker_update(self, worker_id: str, worker_stats: WorkerStats) -> None:
        pass
```

The above is equivalent to setting:

```python
MyPlugin._on_worker_update.__aiperf_hook_type__ = AIPerfHook.ON_WORKER_UPDATE
```
Source code in aiperf/common/hooks.py
provides_hooks(*hook_types)
¶
Decorator to specify that the class provides a hook of the given type to all of its subclasses.
Example:

```python
@provides_hooks(AIPerfHook.ON_MESSAGE)
class MessageBusClientMixin(CommunicationMixin):
    pass
```

The above is equivalent to setting:

```python
MessageBusClientMixin.__provides_hooks__ = {AIPerfHook.ON_MESSAGE}
```
Source code in aiperf/common/hooks.py
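Putting the pieces together, the attribute-tagging scheme that all of these decorators share can be sketched with a minimal mixin (illustrative; the real HooksMixin also handles async hooks, hook parameters, and provides_hooks validation):

```python
def on_init(func):
    """Sketch: tag a method as an ON_INIT hook via a function attribute."""
    func.__aiperf_hook_type__ = "on_init"  # illustrative marker string
    return func


class HooksMixin:
    """Sketch: collect tagged methods, including those from base classes."""

    def get_hooks(self, hook_type):
        hooks = []
        for name in dir(self):
            attr = getattr(self, name)
            # Bound methods forward attribute access to the function object,
            # so the tag set by the decorator is visible here.
            if getattr(attr, "__aiperf_hook_type__", None) == hook_type:
                hooks.append(attr)
        return hooks

    def run_hooks(self, hook_type):
        for hook in self.get_hooks(hook_type):
            hook()


class BaseService(HooksMixin):
    @on_init
    def _init_base(self):
        self.base_ready = True


class MyService(BaseService):
    @on_init
    def _init_mine(self):
        self.mine_ready = True
```

Because the mixin scans dir(self), hooks defined on base classes are inherited automatically, matching the behavior described above.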
aiperf.common.logging¶
MultiProcessLogHandler
¶
Bases: RichHandler
Custom logging handler that forwards log records to a multiprocessing queue.
Source code in aiperf/common/logging.py
emit(record)
¶
Emit a log record to the queue.
Source code in aiperf/common/logging.py
create_file_handler(log_folder, level)
¶
Configure a file handler for logging.
Source code in aiperf/common/logging.py
get_global_log_queue()
cached
¶
Get the global log queue. Will create a new queue if it doesn't exist.
Source code in aiperf/common/logging.py
setup_child_process_logging(log_queue=None, service_id=None, service_config=None, user_config=None)
¶
Set up logging for a child process to send logs to the main process.
This should be called early in child process initialization.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| log_queue | Queue \| None | The multiprocessing queue to send logs to. If None, tries to get the global queue. | None |
| service_id | str \| None | The ID of the service to log under. If None, logs will be under the process name. | None |
| service_config | ServiceConfig \| None | The service configuration used to determine the log level. | None |
| user_config | UserConfig \| None | The user configuration used to determine the log folder. | None |
Source code in aiperf/common/logging.py
setup_rich_logging(user_config, service_config)
¶
Set up rich logging with appropriate configuration.
Source code in aiperf/common/logging.py, lines 121–151
aiperf.common.messages.base_messages¶
ErrorMessage
¶
Bases: Message
Message containing error data.
Source code in aiperf/common/messages/base_messages.py, lines 112–117
Message
¶
Bases: AIPerfBaseModel
Base message class for optimized message handling. Based on the AIPerfBaseModel class, so it supports the @exclude_if_none decorator. See :class:AIPerfBaseModel for more details.
This class provides a base for all messages, including common fields like message_type, request_ns, and request_id. It also supports optional field exclusion based on the @exclude_if_none decorator.
Each message model should inherit from this class, set the message_type field, and define its own additional fields.
Example:

    @exclude_if_none("some_field")
    class ExampleMessage(Message):
        some_field: int | None = Field(default=None)
        other_field: int = Field(default=1)
Source code in aiperf/common/messages/base_messages.py, lines 18–100
from_json(json_str)
classmethod
¶
Deserialize a message from a JSON string, attempting to auto-detect the message type.
NOTE: If you already know the message type, use the more performant :meth:from_json_with_type instead.
Source code in aiperf/common/messages/base_messages.py, lines 70–84
from_json_with_type(message_type, json_str)
classmethod
¶
Deserialize a message from a JSON string with a specific message type.
NOTE: This is more performant than :meth:from_json because it does not need to
convert the JSON string to a dictionary first.
Source code in aiperf/common/messages/base_messages.py, lines 86–97
RequiresRequestNSMixin
¶
Bases: Message
Mixin for messages that require a request_ns field.
Source code in aiperf/common/messages/base_messages.py, lines 103–109
aiperf.common.messages.command_messages¶
CommandMessage
¶
Bases: TargetedServiceMessage
Message containing command data. This message is sent by the system controller to a service to command it to do something.
Source code in aiperf/common/messages/command_messages.py, lines 57–97
from_json(json_str)
classmethod
¶
Deserialize a command message from a JSON string, attempting to auto-detect the command type.
Source code in aiperf/common/messages/command_messages.py, lines 80–97
CommandResponse
¶
Bases: TargetedServiceMessage
Message containing a command response.
Source code in aiperf/common/messages/command_messages.py, lines 100–165
from_json(json_str)
classmethod
¶
Deserialize a command response message from a JSON string, attempting to auto-detect the command response type.
Source code in aiperf/common/messages/command_messages.py, lines 139–165
CommandSuccessResponse
¶
Bases: CommandResponse
Generic command response message when a command succeeds. It should be subclassed for specific command types.
Source code in aiperf/common/messages/command_messages.py, lines 188–208
ConnectionProbeMessage
¶
Bases: TargetedServiceMessage
Message containing a connection probe from a service. This is used to probe the connection to the service.
Source code in aiperf/common/messages/command_messages.py, lines 345–348
ProcessRecordsCommand
¶
Bases: CommandMessage
Data to send with the process records command.
Source code in aiperf/common/messages/command_messages.py, lines 284–292
ProcessRecordsResponse
¶
Bases: CommandSuccessResponse
Response to the process records command.
Source code in aiperf/common/messages/command_messages.py, lines 334–342
ProfileCancelCommand
¶
Bases: CommandMessage
Command message sent to request services to cancel profiling.
Source code in aiperf/common/messages/command_messages.py, lines 310–313
ProfileConfigureCommand
¶
Bases: CommandMessage
Data to send with the profile configure command.
Source code in aiperf/common/messages/command_messages.py, lines 295–301
ProfileStartCommand
¶
Bases: CommandMessage
Command message sent to request services to start profiling.
Source code in aiperf/common/messages/command_messages.py, lines 304–307
RegisterServiceCommand
¶
Bases: CommandMessage
Command message sent from a service to the system controller to register itself.
Source code in aiperf/common/messages/command_messages.py, lines 322–331
ShutdownCommand
¶
Bases: CommandMessage
Command message sent to request a service to shutdown.
Source code in aiperf/common/messages/command_messages.py, lines 316–319
TargetedServiceMessage
¶
Bases: BaseServiceMessage
Message that can be targeted to a specific service by id or type.
If both target_service_type and target_service_id are None, the message is
sent to all services that are subscribed to the message type.
Source code in aiperf/common/messages/command_messages.py, lines 28–54
aiperf.common.messages.credit_messages¶
CreditDropMessage
¶
Bases: BaseServiceMessage
Message indicating that a credit has been dropped. This message is sent by the timing manager to workers to indicate that credit(s) have been dropped.
Source code in aiperf/common/messages/credit_messages.py, lines 11–26
CreditPhaseCompleteMessage
¶
Bases: BaseServiceMessage
Message for credit phase complete. Sent by the TimingManager to report that a credit phase has completed.
Source code in aiperf/common/messages/credit_messages.py, lines 108–122
CreditPhaseProgressMessage
¶
Bases: BaseServiceMessage
Sent by the TimingManager to report the progress of a credit phase.
Source code in aiperf/common/messages/credit_messages.py, lines 74–88
CreditPhaseSendingCompleteMessage
¶
Bases: BaseServiceMessage
Message for credit phase sending complete. Sent by the TimingManager to report that a credit phase has completed sending.
Source code in aiperf/common/messages/credit_messages.py, lines 91–105
CreditPhaseStartMessage
¶
Bases: BaseServiceMessage
Message for credit phase start. Sent by the TimingManager to report that a credit phase has started.
Source code in aiperf/common/messages/credit_messages.py, lines 53–71
CreditReturnMessage
¶
Bases: BaseServiceMessage
Message indicating that a credit has been returned. This message is sent by a worker to the timing manager to indicate that work has been completed.
Source code in aiperf/common/messages/credit_messages.py, lines 29–50
CreditsCompleteMessage
¶
Bases: BaseServiceMessage
Credits complete message sent by the TimingManager to the System controller to signify all Credit Phases have been completed.
Source code in aiperf/common/messages/credit_messages.py, lines 125–129
aiperf.common.messages.dataset_messages¶
ConversationRequestMessage
¶
Bases: BaseServiceMessage
Message to request a full conversation by ID.
Source code in aiperf/common/messages/dataset_messages.py, lines 12–23
ConversationResponseMessage
¶
Bases: BaseServiceMessage
Message containing a full conversation.
Source code in aiperf/common/messages/dataset_messages.py, lines 26–30
ConversationTurnRequestMessage
¶
Bases: BaseServiceMessage
Message to request a single turn from a conversation.
Source code in aiperf/common/messages/dataset_messages.py, lines 33–46
ConversationTurnResponseMessage
¶
Bases: BaseServiceMessage
Message containing a single turn from a conversation.
Source code in aiperf/common/messages/dataset_messages.py, lines 49–54
DatasetConfiguredNotification
¶
Bases: BaseServiceMessage
Notification sent to notify other services that the dataset has been configured.
Source code in aiperf/common/messages/dataset_messages.py, lines 74–77
DatasetTimingRequest
¶
Bases: BaseServiceMessage
Message for a dataset timing request.
Source code in aiperf/common/messages/dataset_messages.py, lines 57–60
DatasetTimingResponse
¶
Bases: BaseServiceMessage
Message for a dataset timing response.
Source code in aiperf/common/messages/dataset_messages.py, lines 63–71
aiperf.common.messages.inference_messages¶
InferenceResultsMessage
¶
Bases: BaseServiceMessage
Message for inference results.
Source code in aiperf/common/messages/inference_messages.py, lines 19–26
MetricRecordsMessage
¶
Bases: BaseServiceMessage
Message from the result parser to the records manager to notify it of the metric records for a single request.
Source code in aiperf/common/messages/inference_messages.py, lines 42–64
valid
property
¶
Whether the request was valid.
ParsedInferenceResultsMessage
¶
Bases: BaseServiceMessage
Message for parsed inference results.
Source code in aiperf/common/messages/inference_messages.py, lines 29–39
aiperf.common.messages.progress_messages¶
AllRecordsReceivedMessage
¶
Bases: BaseServiceMessage, RequiresRequestNSMixin
This is sent by the RecordsManager to signal that all parsed records have been received, and the final processing stats are available.
Source code in aiperf/common/messages/progress_messages.py, lines 83–89
ProcessRecordsResultMessage
¶
Bases: BaseServiceMessage
Message for process records result.
Source code in aiperf/common/messages/progress_messages.py, lines 92–97
ProcessingStatsMessage
¶
Bases: BaseServiceMessage
Message for processing stats. Sent by the records manager to the system controller to report the stats of the profile run.
Source code in aiperf/common/messages/progress_messages.py, lines 40–56
ProfileProgressMessage
¶
Bases: BaseServiceMessage
Message for profile progress. Sent by the timing manager to the system controller to report the progress of the profile run.
Source code in aiperf/common/messages/progress_messages.py, lines 14–37
ProfileResultsMessage
¶
Bases: BaseServiceMessage
Message for profile results.
Source code in aiperf/common/messages/progress_messages.py, lines 75–80
RecordsProcessingStatsMessage
¶
Bases: BaseServiceMessage
Message for processing stats. Sent by the RecordsManager to report the stats of the profile run. This contains the stats for a single credit phase only.
Source code in aiperf/common/messages/progress_messages.py, lines 59–72
aiperf.common.messages.service_messages¶
BaseServiceErrorMessage
¶
Bases: BaseServiceMessage
Base message containing error data.
Source code in aiperf/common/messages/service_messages.py, lines 73–78
BaseServiceMessage
¶
Bases: Message
Base message that is sent from a service. Requires a service_id field to specify the service that sent the message.
Source code in aiperf/common/messages/service_messages.py, lines 18–25
BaseStatusMessage
¶
Bases: BaseServiceMessage
Base message containing status data. This message is sent by a service to the system controller to report its status.
Source code in aiperf/common/messages/service_messages.py, lines 28–45
HeartbeatMessage
¶
Bases: BaseStatusMessage
Message containing heartbeat data. This message is sent by a service to the system controller to indicate that it is still running.
Source code in aiperf/common/messages/service_messages.py, lines 64–70
RegistrationMessage
¶
Bases: BaseStatusMessage
Message containing registration data. This message is sent by a service to the system controller to register itself.
Source code in aiperf/common/messages/service_messages.py, lines 56–61
StatusMessage
¶
Bases: BaseStatusMessage
Message containing status data. This message is sent by a service to the system controller to report its status.
Source code in aiperf/common/messages/service_messages.py, lines 48–53
aiperf.common.messages.worker_messages¶
WorkerHealthMessage
¶
Bases: BaseServiceMessage
Message for a worker health check.
Source code in aiperf/common/messages/worker_messages.py, lines 13–31
error_rate
property
¶
The error rate of the worker.
WorkerStatusSummaryMessage
¶
Bases: BaseServiceMessage
Message for a worker status summary.
Source code in aiperf/common/messages/worker_messages.py, lines 34–42
aiperf.common.mixins.aiperf_lifecycle_mixin¶
AIPerfLifecycleMixin
¶
Bases: TaskManagerMixin, HooksMixin
This mixin provides a lifecycle state machine and is the basis for most components in the AIPerf framework. It provides a set of hooks that are run at each state transition, and the ability to define background tasks that are automatically run on @on_start and canceled via @on_stop.
It exposes initialize, start, and stop methods, as well as a way to query the current state of the lifecycle, giving users a simple interface to interact with.
Source code in aiperf/common/mixins/aiperf_lifecycle_mixin.py, lines 23–287
is_running
property
¶
Whether the lifecycle's current state is LifecycleState.RUNNING.
stop_requested
property
writable
¶
Whether the lifecycle has been requested to stop.
__init__(id=None, **kwargs)
¶
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| id | str \| None | The id of the lifecycle. If not provided, a random uuid will be generated. | None |
Source code in aiperf/common/mixins/aiperf_lifecycle_mixin.py, lines 40–54
attach_child_lifecycle(child)
¶
Attach a child lifecycle to manage. This child will now have its lifecycle managed and controlled by this lifecycle. A common use case is a Service as the parent lifecycle, with supporting components (streaming post processors, progress reporters, etc.) as children.
Children will be called in the order they were attached for initialize and start, and in reverse order for stop.
Source code in aiperf/common/mixins/aiperf_lifecycle_mixin.py, lines 241–254
initialize()
async
¶
Initialize the lifecycle and run the @on_init hooks.
NOTE: It is generally discouraged from overriding this method. Instead, use the @on_init hook to handle your own initialization logic.
Source code in aiperf/common/mixins/aiperf_lifecycle_mixin.py, lines 128–155
initialize_and_start()
async
¶
Initialize and start the lifecycle. This is a convenience method that calls initialize and start in sequence.
Source code in aiperf/common/mixins/aiperf_lifecycle_mixin.py, lines 182–185
start()
async
¶
Start the lifecycle and run the @on_start hooks.
NOTE: It is generally discouraged from overriding this method. Instead, use the @on_start hook to handle your own starting logic.
Source code in aiperf/common/mixins/aiperf_lifecycle_mixin.py, lines 157–180
stop()
async
¶
Stop the lifecycle and run the @on_stop hooks.
NOTE: It is generally discouraged from overriding this method. Instead, use the @on_stop hook to handle your own stopping logic.
Source code in aiperf/common/mixins/aiperf_lifecycle_mixin.py, lines 187–206
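The initialize/start/stop contract and the reverse-order stop behavior can be sketched with a minimal, self-contained stand-in (this is illustrative only; the real AIPerfLifecycleMixin uses the @on_init/@on_start/@on_stop decorators and a full state machine):

```python
import asyncio

# Minimal stand-in for the lifecycle contract: hooks run in registration
# order for init/start, and in reverse order for stop (children first).
class MiniLifecycle:
    def __init__(self):
        self._hooks = {"init": [], "start": [], "stop": []}
        self.state = "created"

    def on(self, phase, func):
        self._hooks[phase].append(func)

    async def initialize(self):
        for hook in self._hooks["init"]:
            await hook()
        self.state = "initialized"

    async def start(self):
        for hook in self._hooks["start"]:
            await hook()
        self.state = "running"

    async def stop(self):
        for hook in reversed(self._hooks["stop"]):
            await hook()
        self.state = "stopped"

events = []

async def record(name):
    events.append(name)

lc = MiniLifecycle()
lc.on("init", lambda: record("parent init"))
lc.on("stop", lambda: record("parent stop"))
lc.on("stop", lambda: record("child stop"))  # registered later, stops first

async def main():
    await lc.initialize()
    await lc.start()
    await lc.stop()

asyncio.run(main())
```

After running, `events` is `["parent init", "child stop", "parent stop"]`, mirroring how attached children are stopped before their parent.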
aiperf.common.mixins.aiperf_logger_mixin¶
AIPerfLoggerMixin
¶
Bases: BaseMixin
Mixin to provide lazily evaluated logging for f-strings.
This mixin provides a logger with lazy evaluation support for f-strings, and direct log functions for all standard and custom logging levels.
see :class:AIPerfLogger for more details.
Usage:

    class MyClass(AIPerfLoggerMixin):
        def __init__(self):
            super().__init__()
            self.trace(lambda: f"Processing {item} of {count} ({item / count * 100}% complete)")
            self.info("Simple string message")
            self.debug(lambda i=i: f"Binding loop variable: {i}")
            self.warning("Warning message: %s", "legacy support")
            self.success("Benchmark completed successfully")
            self.notice("Warmup has completed")
            self.exception(f"Direct f-string usage: {e}")
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 23–112
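The point of the lazy lambda form is that the message is only evaluated when the level is actually enabled, so an expensive f-string costs nothing when it would be filtered out anyway. A minimal sketch of the idea (not the real AIPerfLogger):

```python
import logging

logger = logging.getLogger("aiperf.example")
logger.setLevel(logging.INFO)

calls = {"count": 0}

def expensive():
    # Stands in for costly formatting work inside an f-string.
    calls["count"] += 1
    return "big formatted payload"

def lazy_log(level, message):
    # Only evaluate callables when the level is enabled.
    if logger.isEnabledFor(level):
        logger.log(level, message() if callable(message) else message)

lazy_log(logging.DEBUG, lambda: f"debug: {expensive()}")  # filtered: never evaluated
lazy_log(logging.INFO, lambda: f"info: {expensive()}")    # enabled: evaluated once
```

Here `expensive()` runs exactly once, because the DEBUG message is discarded before its lambda is ever called.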
critical(message, *args, **kwargs)
¶
Log a critical message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 109–112
debug(message, *args, **kwargs)
¶
Log a debug message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 74–77
error(message, *args, **kwargs)
¶
Log an error message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 99–102
exception(message, *args, **kwargs)
¶
Log an exception message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 104–107
info(message, *args, **kwargs)
¶
Log an info message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 79–82
log(level, message, *args, **kwargs)
¶
Log a message at a specified level with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 62–67
notice(message, *args, **kwargs)
¶
Log a notice message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 84–87
success(message, *args, **kwargs)
¶
Log a success message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 94–97
trace(message, *args, **kwargs)
¶
Log a trace message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 69–72
warning(message, *args, **kwargs)
¶
Log a warning message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 89–92
aiperf.common.mixins.base_mixin¶
BaseMixin
¶
Base mixin class.
This mixin creates a contract that mixins should always pass **kwargs to super().__init__(), regardless of whether they extend another mixin or not.
This ensures that BaseMixin is the last mixin to have its __init__ method called, which means that all other mixins form a proper chain of __init__ calls with the correct arguments and no accidental broken inheritance.
Source code in aiperf/common/mixins/base_mixin.py, lines 5–19
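The cooperative **kwargs contract can be demonstrated with a small self-contained example (the mixin names and parameters below are hypothetical, chosen only to show the pattern):

```python
# Every mixin forwards **kwargs to super().__init__() so the cooperative
# chain never breaks, regardless of the MRO.
class BaseMixin:
    def __init__(self, **kwargs):
        super().__init__()

class LoggerMixin(BaseMixin):
    def __init__(self, log_level="INFO", **kwargs):
        self.log_level = log_level
        super().__init__(**kwargs)

class TaskMixin(BaseMixin):
    def __init__(self, max_tasks=10, **kwargs):
        self.max_tasks = max_tasks
        super().__init__(**kwargs)

class Service(LoggerMixin, TaskMixin):
    def __init__(self, name, **kwargs):
        self.name = name
        super().__init__(**kwargs)

# Each mixin consumes its own keyword and forwards the rest down the MRO:
# Service -> LoggerMixin -> TaskMixin -> BaseMixin -> object.
svc = Service("records-manager", log_level="DEBUG", max_tasks=4)
```

If any mixin dropped **kwargs instead of forwarding them, arguments meant for mixins later in the MRO would silently be lost; the contract prevents exactly that.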
aiperf.common.mixins.command_handler_mixin¶
CommandHandlerMixin
¶
Bases: MessageBusClientMixin, ABC
Mixin to provide command handling functionality to a service.
This mixin is used by the BaseService class, and is not intended to be used directly.
Source code in aiperf/common/mixins/command_handler_mixin.py, lines 29–262
send_command_and_wait_for_all_responses(command, service_ids, timeout=DEFAULT_COMMAND_RESPONSE_TIMEOUT)
async
¶
Broadcast a command message to multiple services and wait for the responses from all of the specified services. This is useful for the system controller to send one command but wait for all services to respond.
Source code in aiperf/common/mixins/command_handler_mixin.py, lines 188–222
send_command_and_wait_for_response(message, timeout=DEFAULT_COMMAND_RESPONSE_TIMEOUT)
async
¶
Send a single command message to a single service and wait for the response. This is useful for communicating directly with a single service.
Source code in aiperf/common/mixins/command_handler_mixin.py, lines 169–186
aiperf.common.mixins.communication_mixin¶
CommunicationMixin
¶
Bases: AIPerfLifecycleMixin, ABC
Mixin to provide access to a CommunicationProtocol instance. This mixin should be inherited by any mixin that needs access to the communication layer to create Communication clients.
Source code in aiperf/common/mixins/communication_mixin.py, lines 12–24
aiperf.common.mixins.hooks_mixin¶
HooksMixin
¶
Bases: AIPerfLoggerMixin
Mixin for a class to be able to provide hooks to its subclasses, and to be able to run them. A "hook" is a function that is decorated with a hook type (AIPerfHook), and optional parameters.
In order to provide hooks, a class MUST use the @provides_hooks decorator to declare the hook types it provides.
Only list hook types that you call get_hooks or run_hooks on, to get or run the functions that are decorated
with those hook types.
Provided hooks are recursively inherited by subclasses, so if a base class provides a hook,
all subclasses will also provide that hook (without having to explicitly declare it, or call get_hooks or run_hooks).
In fact, you typically should not get or run hooks from the base class, as this may lead to calling hooks twice.
Hooks are registered in the order they are defined within the same class from top to bottom, and each class's hooks are inspected starting with hooks defined in the lowest level of base classes, moving up to the highest subclass.
IMPORTANT:
- Hook decorated methods from one class can be named the same as methods in their base classes, and BOTH will be registered.
Meaning if class A and class B both have a method named _initialize, which is decorated with @on_init, and class B inherits from class A,
then both _initialize methods will be registered as hooks, and will be run in the order A._initialize, then B._initialize.
This is done without requiring the user to call super()._initialize in the subclass, as the base class hook will be run automatically.
However, the caveat is that it is not possible to disable the hook from the base class without extra work, and if the user does accidentally
call super()._initialize in the subclass, the base class hook may be called twice.
Source code in aiperf/common/mixins/hooks_mixin.py, lines 16–164
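The base-class-first registration order, and the fact that same-named hooks in a base class and a subclass are BOTH registered, can be sketched with a tiny standalone registry that walks the MRO from base to subclass (illustrative only; the real HooksMixin uses AIPerfHook types and Hook objects):

```python
def on_init(func):
    # Minimal stand-in for a hook decorator: just tags the function.
    func.__hook__ = "on_init"
    return func

def get_hooks(obj, hook_type):
    # Walk the MRO from the most basic class up to the actual class,
    # collecting decorated methods in definition order. Same-named
    # methods in different classes are BOTH collected.
    hooks = []
    for cls in reversed(type(obj).__mro__):
        for name, attr in vars(cls).items():
            if getattr(attr, "__hook__", None) == hook_type:
                hooks.append(attr)
    return hooks

class A:
    @on_init
    def _initialize(self):
        return "A"

class B(A):
    @on_init
    def _initialize(self):  # same name as A's hook: both are registered
        return "B"

b = B()
order = [hook(b) for hook in get_hooks(b, "on_init")]
```

Here `order` is `["A", "B"]`: A's hook runs first even though B's method shadows it, without B ever calling `super()._initialize()` — which is also why calling `super()._initialize()` manually would run A's hook twice.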
for_each_hook_param(*hook_types, self_obj, param_type, lambda_func, reverse=False)
¶
Iterate over the hooks for the given hook type(s), optionally reversed. The lambda_func will be called for each parameter of the hook, with the hook and the parameter passed as arguments.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| hook_types | HookType | The hook types to iterate over. | () |
| self_obj | Any | The object to pass to the lambda_func. | required |
| param_type | AnyT | The type of the parameter to pass to the lambda_func (for validation). | required |
| lambda_func | Callable[[Hook, AnyT], None] | The function to call for each hook. | required |
| reverse | bool | Whether to iterate over the hooks in reverse order. | False |
Source code in aiperf/common/mixins/hooks_mixin.py, lines 99–135
get_hooks(*hook_types, reverse=False)
¶
Get the hooks that are defined by the class for the given hook type(s), optionally reversed. This will return a list of Hook objects that can be inspected for their type and parameters, and optionally called.
Source code in aiperf/common/mixins/hooks_mixin.py, lines 85–97
run_hooks(*hook_types, reverse=False, **kwargs)
async
¶
Run the hooks for the given hook type, waiting for each hook to complete before running the next one. Hooks are run in the order they are defined by the class, starting with hooks defined in the lowest level of base classes, moving up to the top level class. If more than one hook type is provided, the hooks from each level of classes will be run in the order of the hook types provided.
If reverse is True, the hooks will be run in reverse order. This is useful for stop/cleanup hooks, where you want to start with the children and end with the parent.
The kwargs are passed through as keyword arguments to each hook.
Source code in aiperf/common/mixins/hooks_mixin.py, lines 137–164
aiperf.common.mixins.message_bus_mixin¶
MessageBusClientMixin
¶
Bases: CommunicationMixin, ABC
Mixin to provide message bus clients (pub and sub) for AIPerf components, as well as a hook to handle messages: @on_message.
Source code in aiperf/common/mixins/message_bus_mixin.py, lines 32–141
publish(message)
async
¶
Publish a message. The message will be routed automatically based on the message type.
Source code in aiperf/common/mixins/message_bus_mixin.py, lines 139–141
subscribe(message_type, callback)
async
¶
Subscribe to a specific message type. The callback will be called when a message is received for the given message type.
Source code in aiperf/common/mixins/message_bus_mixin.py, lines 118–125
subscribe_all(message_callback_map)
async
¶
Subscribe to all message types in the map. The callback(s) will be called when a message is received for the given message type.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| message_callback_map | MessageCallbackMapT | A map of message types to callbacks. The callbacks can be a single callback or a list of callbacks. | required |
Source code in aiperf/common/mixins/message_bus_mixin.py, lines 127–137
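The subscribe/publish flow can be sketched with a minimal in-process message bus (illustrative only; the real clients route messages over the communication layer, and MiniMessageBus is a hypothetical stand-in):

```python
import asyncio
from collections import defaultdict

class MiniMessageBus:
    def __init__(self):
        self._subs = defaultdict(list)

    async def subscribe(self, message_type, callback):
        self._subs[message_type].append(callback)

    async def subscribe_all(self, message_callback_map):
        # Values may be a single callback or a list of callbacks.
        for message_type, cbs in message_callback_map.items():
            for cb in cbs if isinstance(cbs, list) else [cbs]:
                await self.subscribe(message_type, cb)

    async def publish(self, message_type, message):
        # Route automatically based on the message type.
        for cb in self._subs[message_type]:
            await cb(message)

received = []

async def on_heartbeat(msg):
    received.append(msg)

async def main():
    bus = MiniMessageBus()
    await bus.subscribe_all({"heartbeat": on_heartbeat})
    await bus.publish("heartbeat", {"service_id": "worker-1"})

asyncio.run(main())
```

In the real mixin the @on_message hook registers callbacks for you; the sketch above only illustrates the type-based routing that subscribe_all and publish perform.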
aiperf.common.mixins.process_health_mixin¶
ProcessHealthMixin
¶
Bases: BaseMixin
Mixin to provide process health information.
Source code in aiperf/common/mixins/process_health_mixin.py, lines 11–48
get_process_health()
¶
Get the process health information for the current process.
Source code in aiperf/common/mixins/process_health_mixin.py, lines 24–48
aiperf.common.mixins.progress_tracker_mixin¶
ProgressTrackerMixin
¶
Bases: MessageBusClientMixin
A progress tracker that tracks the progress of the entire benchmark suite.
Source code in aiperf/common/mixins/progress_tracker_mixin.py, lines 30–216
phase_progress_context(phase)
async
¶
Context manager for safely accessing phase progress info with warning.
Source code in aiperf/common/mixins/progress_tracker_mixin.py, lines 204–216
aiperf.common.mixins.pull_client_mixin¶
PullClientMixin
¶
Bases: CommunicationMixin, ABC
Mixin to provide a pull client for AIPerf components using a PullClient for the specified CommAddress. Add the @on_pull_message decorator to specify a function that will be called when a pull is received.
NOTE: This currently only supports a single pull client per service, as that is our current use case.
Source code in aiperf/common/mixins/pull_client_mixin.py, lines 18–61
aiperf.common.mixins.reply_client_mixin¶
ReplyClientMixin
¶
Bases: CommunicationMixin, ABC
Mixin to provide a reply client for AIPerf components using a ReplyClient for the specified CommAddress. Add the @on_request decorator to specify a function that will be called when a request is received.
NOTE: This currently only supports a single reply client per service, as that is our current use case.
Source code in aiperf/common/mixins/reply_client_mixin.py, lines 18–59.
aiperf.common.mixins.task_manager_mixin¶
TaskManagerMixin
¶
Bases: AIPerfLoggerMixin
Mixin to manage a set of async tasks, and provide background task loop capabilities.
Can be used standalone, but it is most useful as part of the :class:AIPerfLifecycleMixin, where the lifecycle methods are automatically integrated with the task manager.
Source code in aiperf/common/mixins/task_manager_mixin.py, lines 14–112.
cancel_all_tasks(timeout=TASK_CANCEL_TIMEOUT_SHORT)
async
¶
Cancel all tasks in the set and wait for up to timeout seconds for them to complete.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| timeout | float | The timeout to wait for the tasks to complete. | TASK_CANCEL_TIMEOUT_SHORT |
Source code in aiperf/common/mixins/task_manager_mixin.py, lines 38–51.
execute_async(coro)
¶
Create a task from a coroutine, add it to the set of tasks, and return immediately. The task will be automatically cleaned up when it completes.
Source code in aiperf/common/mixins/task_manager_mixin.py, lines 25–32.
start_background_task(method, interval=None, immediate=False, stop_on_error=False, stop_event=None)
¶
Run a task in the background, in a loop until cancelled.
Source code in aiperf/common/mixins/task_manager_mixin.py, lines 53–66.
wait_for_tasks()
async
¶
Wait for all current tasks to complete.
Source code in aiperf/common/mixins/task_manager_mixin.py, lines 34–36.
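A background task loop of the kind start_background_task describes can be sketched with plain asyncio. The interval, immediate, and stop_event parameters mirror the documented signature; the loop body is an assumption about behavior, not the actual implementation:

```python
import asyncio

async def background_loop(method, interval=None, immediate=False, stop_event=None):
    """Run `method` repeatedly until `stop_event` is set (illustrative sketch)."""
    stop_event = stop_event or asyncio.Event()
    if immediate:
        await method()
    while not stop_event.is_set():
        if interval:
            try:
                # Sleep for `interval`, but wake early if the stop event fires.
                await asyncio.wait_for(stop_event.wait(), timeout=interval)
                break
            except asyncio.TimeoutError:
                pass
        await method()

async def main():
    counter = {"n": 0}
    stop = asyncio.Event()

    async def tick():
        counter["n"] += 1
        if counter["n"] >= 3:
            stop.set()  # stop the loop from inside the task

    await background_loop(tick, interval=0.01, immediate=True, stop_event=stop)
    return counter["n"]
```

In the real mixin, such a loop would itself be wrapped via execute_async so it participates in cancel_all_tasks and wait_for_tasks.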
aiperf.common.mixins.worker_tracker_mixin¶
WorkerTrackerMixin
¶
Bases: MessageBusClientMixin
A worker tracker that tracks the health and tasks of the workers.
Source code in aiperf/common/mixins/worker_tracker_mixin.py, lines 13–49.
aiperf.common.models.base_models¶
AIPerfBaseModel
¶
Bases: BaseModel
Base model for all AIPerf Pydantic models. This class is configured to allow arbitrary types as fields, enabling more flexible model definitions by end users without breaking existing code.
The @exclude_if_none decorator can also be used to specify which fields should be excluded from the serialized model if they are None. This is a workaround for the fact that pydantic does not support specifying exclude_none on a per-field basis.
Source code in aiperf/common/models/base_models.py, lines 25–53.
exclude_if_none(*field_names)
¶
Decorator to set the _exclude_if_none_fields class attribute to the set of field names that should be excluded if they are None.
Source code in aiperf/common/models/base_models.py, lines 10–22.
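The per-field exclude-if-None behavior can be approximated without pydantic. This is a hypothetical stand-in to illustrate the mechanism; the real decorator operates on Pydantic models:

```python
def exclude_if_none(*field_names):
    """Class decorator recording which fields to drop from output when None."""
    def wrap(cls):
        cls._exclude_if_none_fields = set(field_names)
        return cls
    return wrap

@exclude_if_none("error", "detail")
class Result:
    _exclude_if_none_fields = set()

    def __init__(self, value, error=None, detail=None):
        self.value, self.error, self.detail = value, error, detail

    def model_dump(self):
        out = {}
        for name, val in vars(self).items():
            # Skip only the decorated fields, and only when they are None.
            if val is None and name in self._exclude_if_none_fields:
                continue
            out[name] = val
        return out
```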
aiperf.common.models.credit_models¶
CreditPhaseConfig
¶
Bases: AIPerfBaseModel
Model for phase credit config. This is used by the TimingManager to configure the credit phases.
Source code in aiperf/common/models/credit_models.py, lines 14–47.
is_valid
property
¶
A phase config is valid if it is exactly one of the following:
- is_time_based (expected_duration_sec is set and > 0)
- is_request_count_based (total_expected_requests is set and > 0)
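The exactly-one-mode rule above is an exclusive-or over the two configurations; a sketch of the check (the function name is illustrative, not the real property):

```python
def is_valid_phase(expected_duration_sec=None, total_expected_requests=None):
    """A phase config is valid iff exactly one mode is configured (sketch)."""
    time_based = expected_duration_sec is not None and expected_duration_sec > 0
    count_based = total_expected_requests is not None and total_expected_requests > 0
    return time_based != count_based  # exclusive or: exactly one must hold
```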
CreditPhaseStats
¶
Bases: CreditPhaseConfig
Model for phase credit stats. Extends the CreditPhaseConfig fields to track the progress of the credit phases: how many credits were dropped and how many were returned, as well as the progress percentage of the phase.
Source code in aiperf/common/models/credit_models.py, lines 50–132.
in_flight
property
¶
Calculate the number of in-flight credits (sent but not completed).
should_send
property
¶
Whether the phase should send more credits.
from_phase_config(phase_config)
classmethod
¶
Create a CreditPhaseStats from a CreditPhaseConfig. This is used to initialize the stats for a phase.
Source code in aiperf/common/models/credit_models.py, lines 125–132.
ProcessingStats
¶
Bases: AIPerfBaseModel
Model for phase processing stats: how many requests were processed and how many errors were encountered.
Source code in aiperf/common/models/credit_models.py, lines 135–157.
total_records
property
¶
The total number of records processed successfully or in error.
aiperf.common.models.dataset_models¶
Audio
¶
Bases: Media
Media that contains audio data.
Source code in aiperf/common/models/dataset_models.py, lines 36–39.
Conversation
¶
Bases: AIPerfBaseModel
A dataset representation of a full conversation.
A conversation is a sequence of turns between a user and an endpoint, and it contains the session ID and all the turns that constitute the conversation.
Source code in aiperf/common/models/dataset_models.py, lines 73–83.
Image
¶
Bases: Media
Media that contains image data.
Source code in aiperf/common/models/dataset_models.py, lines 30–33.
Media
¶
Bases: AIPerfBaseModel
Base class for all media fields. Contains name and contents of the media data.
Source code in aiperf/common/models/dataset_models.py, lines 13–21.
Text
¶
Bases: Media
Media that contains text/prompt data.
Source code in aiperf/common/models/dataset_models.py, lines 24–27.
Turn
¶
Bases: AIPerfBaseModel
A dataset representation of a single turn within a conversation.
A turn is a single interaction between a user and an AI assistant, and it contains the timestamp, delay, and raw data that the user sends in each turn.
Source code in aiperf/common/models/dataset_models.py, lines 42–70.
aiperf.common.models.error_models¶
ErrorDetails
¶
Bases: AIPerfBaseModel
Encapsulates details about an error.
Source code in aiperf/common/models/error_models.py, lines 10–46.
__eq__(other)
¶
Check if the error details are equal by comparing the code, type, and message.
Source code in aiperf/common/models/error_models.py, lines 26–34.
__hash__()
¶
Hash the error details by hashing the code, type, and message.
Source code in aiperf/common/models/error_models.py, lines 36–38.
from_exception(e)
classmethod
¶
Create an error details object from an exception.
Source code in aiperf/common/models/error_models.py, lines 40–46.
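The value-based equality and hashing over code, type, and message can be sketched with a frozen dataclass. This is an illustrative stand-in, not the actual Pydantic model:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)  # frozen generates __eq__ and __hash__ over all fields
class ErrorDetailsSketch:
    code: Optional[int]
    type: str
    message: str

    @classmethod
    def from_exception(cls, e: BaseException) -> "ErrorDetailsSketch":
        # Derive the type from the exception class and the message from str(e).
        return cls(code=None, type=type(e).__name__, message=str(e))
```

Hashability lets deduplicated error counts (as in ErrorDetailsCount) be built with a plain dict or Counter keyed by the error details.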
ErrorDetailsCount
¶
Bases: AIPerfBaseModel
Count of error details.
Source code in aiperf/common/models/error_models.py, lines 49–56.
aiperf.common.models.health_models¶
ProcessHealth
¶
Bases: AIPerfBaseModel
Model for process health data.
Source code in aiperf/common/models/health_models.py, lines 30–62.
aiperf.common.models.progress_models¶
Models for tracking the progress of the benchmark suite.
ComputedStats
¶
Bases: AIPerfBaseModel
Computed info for a phase (can be used for requests or records).
Source code in aiperf/common/models/progress_models.py, lines 37–46.
FullPhaseProgress
¶
Bases: AIPerfBaseModel
Full state of the credit phase progress, including the progress of the phase, the processing stats, and the worker stats.
Source code in aiperf/common/models/progress_models.py, lines 124–150.
RecordsStats
¶
Bases: ComputedStats, ProcessingStats
Stats for the records. Based on the RecordsManager data.
Source code in aiperf/common/models/progress_models.py, lines 66–92.
RequestsStats
¶
Bases: ComputedStats, CreditPhaseStats
Stats for the requests. Based on the TimingManager data.
Source code in aiperf/common/models/progress_models.py, lines 49–63.
StatsProtocol
¶
Bases: Protocol
Protocol for stats.
Source code in aiperf/common/models/progress_models.py, lines 20–34.
WorkerStats
¶
Bases: AIPerfBaseModel
Stats for a worker.
Source code in aiperf/common/models/progress_models.py, lines 95–121.
aiperf.common.models.record_models¶
InferenceServerResponse
¶
Bases: AIPerfBaseModel
Response from an inference client.
Source code in aiperf/common/models/record_models.py, lines 91–97.
MetricResult
¶
Bases: AIPerfBaseModel
The result values of a single metric.
Source code in aiperf/common/models/record_models.py, lines 21–46.
ParsedResponseRecord
¶
Bases: AIPerfBaseModel
Record of a request and its associated responses, already parsed and ready for metrics.
Source code in aiperf/common/models/record_models.py, lines 338–413.
end_perf_ns
cached
property
¶
Get the end time of the request in nanoseconds (perf_counter_ns). If request.end_perf_ns is not set, use the time of the last response. If there are no responses, use sys.maxsize.
has_error
cached
property
¶
Check if the response record has an error.
request_duration_ns
cached
property
¶
Get the duration of the request in nanoseconds.
start_perf_ns
cached
property
¶
Get the start time of the request in nanoseconds (perf_counter_ns).
timestamp_ns
cached
property
¶
Get the wall-clock timestamp of the request in nanoseconds (time.time_ns). DO NOT USE FOR LATENCY CALCULATIONS.
tokens_per_second
cached
property
¶
Get the number of tokens per second of the request.
valid
cached
property
¶
Check if the response record is valid.
Checks:
- Request has no errors
- Has at least one response
- Start time is before the end time
- Response timestamps are within valid ranges
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if the record is valid, False otherwise. |
ProcessRecordsResult
¶
Bases: AIPerfBaseModel
Result of the process records command.
Source code in aiperf/common/models/record_models.py, lines 76–83.
RequestRecord
¶
Bases: AIPerfBaseModel
Record of a request with its associated responses.
Source code in aiperf/common/models/record_models.py, lines 150–318.
delayed
property
¶
Check if the request was delayed.
has_error
property
¶
Check if the request record has an error.
inter_token_latency_ns
property
¶
Get the interval between responses in nanoseconds.
time_to_first_response_ns
property
¶
Get the time to the first response in nanoseconds.
time_to_last_response_ns
property
¶
Get the time to the last response in nanoseconds.
time_to_second_response_ns
property
¶
Get the time to the second response in nanoseconds.
valid
property
¶
Check if the request record is valid by ensuring that the start time and response timestamps are within valid ranges.
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if the record is valid, False otherwise. |
token_latency_ns(index)
¶
Get the latency of a token in nanoseconds.
Source code in aiperf/common/models/record_models.py, lines 304–318.
ResponseData
¶
Bases: AIPerfBaseModel
Base class for all response data.
Source code in aiperf/common/models/record_models.py, lines 321–335.
SSEField
¶
Bases: AIPerfBaseModel
Base model for a single field in an SSE message.
Source code in aiperf/common/models/record_models.py, lines 113–123.
SSEMessage
¶
Bases: InferenceServerResponse
Individual SSE message from an SSE stream. Messages are delimited by a blank line (two consecutive newlines).
Source code in aiperf/common/models/record_models.py, lines 126–147.
extract_data_content()
¶
Extract the data contents from the SSE message as a list of strings. Note that the SSE spec specifies that the contents of the data fields should be combined into one string, delimited by a single newline. We have left it as a list to allow the caller to decide how to handle the data.
Returns:
list[str]: A list of strings containing the data contents of the SSE message.
Source code in aiperf/common/models/record_models.py, lines 135–147.
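The extraction described above can be sketched with a plain field parser following the SSE spec's data-field rules. This is an assumed implementation for illustration, not the actual method body:

```python
def extract_data_content(raw_message: str) -> list:
    """Collect the contents of every `data:` field in one SSE message (sketch)."""
    contents = []
    for line in raw_message.splitlines():
        if line.startswith("data:"):
            value = line[len("data:"):]
            if value.startswith(" "):  # the SSE spec strips one leading space
                value = value[1:]
            contents.append(value)
    return contents
```

Per the spec, a caller that wants the combined payload would join the returned list with a single newline.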
TextResponse
¶
Bases: InferenceServerResponse
Raw text response from an inference client, including an optional content type.
Source code in aiperf/common/models/record_models.py, lines 100–110.
aiperf.common.models.service_models¶
ServiceRunInfo
¶
Bases: AIPerfBaseModel
Base model for tracking service run information.
Source code in aiperf/common/models/service_models.py, lines 16–42.
aiperf.common.models.worker_models¶
WorkerTaskStats
¶
Bases: AIPerfBaseModel
Stats for the tasks that have been sent to the worker.
Source code in aiperf/common/models/worker_models.py, lines 10–32.
in_progress
property
¶
The number of tasks that are currently in progress.
This is the total number of tasks sent to the worker minus the number of failed and successfully completed tasks.
aiperf.common.protocols¶
AIPerfLifecycleProtocol
¶
Bases: TaskManagerProtocol, Protocol
Protocol for AIPerf lifecycle methods.
see :class:aiperf.common.mixins.aiperf_lifecycle_mixin.AIPerfLifecycleMixin for more details.
Source code in aiperf/common/protocols.py, lines 95–120.
AIPerfUIProtocol
¶
Bases: AIPerfLifecycleProtocol, Protocol
Protocol interface definition for AIPerf UI implementations.
Any class that implements the AIPerfLifecycleProtocol can serve as a UI. However, to get progress tracking and worker tracking, the simplest approach is to inherit from the :class:aiperf.ui.base_ui.BaseAIPerfUI.
Source code in aiperf/common/protocols.py, lines 301–307.
CommunicationProtocol
¶
Bases: AIPerfLifecycleProtocol, Protocol
Protocol for the base communication layer.
see :class:aiperf.common.comms.base_comms.BaseCommunication for more details.
Source code in aiperf/common/protocols.py, lines 202–283.
create_client(client_type, address, bind=False, socket_ops=None, max_pull_concurrency=None)
¶
Create a client for the given client type and address, which will be automatically started and stopped with the CommunicationProtocol instance.
Source code in aiperf/common/protocols.py, lines 212–222.
create_pub_client(address, bind=False, socket_ops=None)
¶
Create a PUB client for the given address, which will be automatically started and stopped with the CommunicationProtocol instance.
Source code in aiperf/common/protocols.py, lines 224–232.
create_pull_client(address, bind=False, socket_ops=None, max_pull_concurrency=None)
¶
Create a PULL client for the given address, which will be automatically started and stopped with the CommunicationProtocol instance.
Source code in aiperf/common/protocols.py, lines 254–263.
create_push_client(address, bind=False, socket_ops=None)
¶
Create a PUSH client for the given address, which will be automatically started and stopped with the CommunicationProtocol instance.
Source code in aiperf/common/protocols.py, lines 244–252.
create_reply_client(address, bind=False, socket_ops=None)
¶
Create a REPLY client for the given address, which will be automatically started and stopped with the CommunicationProtocol instance.
Source code in aiperf/common/protocols.py, lines 275–283.
create_request_client(address, bind=False, socket_ops=None)
¶
Create a REQUEST client for the given address, which will be automatically started and stopped with the CommunicationProtocol instance.
Source code in aiperf/common/protocols.py, lines 265–273.
create_sub_client(address, bind=False, socket_ops=None)
¶
Create a SUB client for the given address, which will be automatically started and stopped with the CommunicationProtocol instance.
Source code in aiperf/common/protocols.py, lines 234–242.
ConsoleExporterProtocol
¶
Bases: Protocol
Protocol for console exporters.
Any class implementing this protocol will be provided an ExporterConfig and must provide an
export method that takes a rich Console and handles exporting them appropriately.
Source code in aiperf/common/protocols.py, lines 310–319.
DataExporterProtocol
¶
Bases: Protocol
Protocol for data exporters.
Any class implementing this protocol will be provided an ExporterConfig and must provide an
export method that handles exporting the data appropriately.
Source code in aiperf/common/protocols.py, lines 322–332.
HooksProtocol
¶
Bases: Protocol
Protocol for hooks methods provided by the HooksMixin.
Source code in aiperf/common/protocols.py, lines 335–350.
get_hooks(*hook_types, reversed=False)
¶
Get the hooks for the given hook type(s), optionally reversed.
Source code in aiperf/common/protocols.py, lines 339–341.
run_hooks(*hook_types, reversed=False, **kwargs)
async
¶
Run the hooks for the given hook type, waiting for each hook to complete before running the next one. If reversed is True, the hooks will be run in reverse order. This is useful for stop/cleanup starting with the children and ending with the parent.
Source code in aiperf/common/protocols.py, lines 343–350.
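The ordered and reversed hook execution described above can be sketched with a toy registry. The parameter is named `reverse` here to avoid shadowing Python's `reversed` builtin; the real API uses `reversed=`, and the class is an illustrative stand-in:

```python
import asyncio

class HooksSketch:
    """Toy registry showing ordered vs. reversed hook execution."""

    def __init__(self):
        self._hooks = {}

    def add_hook(self, hook_type, func):
        self._hooks.setdefault(hook_type, []).append(func)

    def get_hooks(self, *hook_types, reverse=False):
        hooks = [h for t in hook_types for h in self._hooks.get(t, [])]
        return list(reversed(hooks)) if reverse else hooks

    async def run_hooks(self, *hook_types, reverse=False, **kwargs):
        # Await each hook to completion before starting the next;
        # reverse order is useful for stop/cleanup (children before parent).
        for hook in self.get_hooks(*hook_types, reverse=reverse):
            await hook(**kwargs)
```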
InferenceClientProtocol
¶
Bases: Protocol
Protocol for an inference server client.
This protocol defines the methods that must be implemented by any inference server client implementation that is compatible with the AIPerf framework.
Source code in aiperf/common/protocols.py, lines 353–388.
__init__(model_endpoint)
¶
Create a new inference server client based on the provided configuration.
Source code in aiperf/common/protocols.py, lines 361–363.
close()
async
¶
Close the client.
Source code in aiperf/common/protocols.py, lines 386–388.
initialize()
async
¶
Initialize the inference server client in an asynchronous context.
Source code in aiperf/common/protocols.py, lines 365–367.
send_request(model_endpoint, payload)
async
¶
Send a request to the inference server.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_endpoint | ModelEndpointInfoT | The endpoint to send the request to. | required |
| payload | RequestInputT | The payload to send to the inference server. | required |
Returns: The raw response from the inference server.
Source code in aiperf/common/protocols.py, lines 369–384.
MessageBusClientProtocol
¶
Bases: PubClientProtocol, SubClientProtocol, Protocol
A message bus client is a client that can publish and subscribe to messages on the event bus/message bus.
Source code in aiperf/common/protocols.py, lines 286–293.
RecordProcessorProtocol
¶
Bases: Protocol
Protocol for a record processor that processes the incoming records and returns the results of the post processing.
Source code in aiperf/common/protocols.py, lines 476–482.
RequestConverterProtocol
¶
Bases: Protocol
Protocol for a request converter that converts a raw request to a formatted request for the inference server.
Source code in aiperf/common/protocols.py, lines 403–411.
format_payload(model_endpoint, turn)
async
¶
Format the turn for the inference server.
Source code in aiperf/common/protocols.py, lines 407–411.
ResponseExtractorProtocol
¶
Bases: Protocol
Protocol for a response extractor that extracts the response data from a raw inference server response and converts it to a list of ResponseData objects.
Source code in aiperf/common/protocols.py, lines 391–400.
extract_response_data(record, tokenizer)
async
¶
Extract the response data from a raw inference server response and convert it to a list of ResponseData objects.
Source code in aiperf/common/protocols.py, lines 396–400.
ResultsProcessorProtocol
¶
Bases: Protocol
Protocol for a results processor that processes the results of multiple record processors, and provides the ability to summarize the results.
Source code in aiperf/common/protocols.py, lines 485–494.
ServiceManagerProtocol
¶
Bases: AIPerfLifecycleProtocol, Protocol
Protocol for a service manager that manages the running of services using the specific ServiceRunType.
Abstracts away the details of service deployment and management.
see :class:aiperf.controller.base_service_manager.BaseServiceManager for more details.
Source code in aiperf/common/protocols.py, lines 414–457.
ServiceProtocol
¶
Bases: MessageBusClientProtocol, Protocol
Protocol for a service. Essentially a MessageBusClientProtocol with service_type and service_id attributes.
Source code in aiperf/common/protocols.py, lines 460–473.
aiperf.common.tokenizer¶
Tokenizer
¶
This class provides a simplified interface for using Huggingface tokenizers, with default arguments for common operations.
Source code in aiperf/common/tokenizer.py, lines 22–135.
bos_token_id
property
¶
Return the beginning-of-sequence (BOS) token ID.
__call__(text, **kwargs)
¶
Call the underlying Huggingface tokenizer with default arguments, which can be overridden by kwargs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | | The input text to tokenize. | required |
Returns:
| Type | Description |
|---|---|
| BatchEncoding | A BatchEncoding object containing the tokenized output. |
Source code in aiperf/common/tokenizer.py, lines 61–74.
__init__()
¶
Initialize the tokenizer with default values for call, encode, and decode.
Source code in aiperf/common/tokenizer.py, lines 28–35.
__repr__()
¶
Return a string representation of the underlying tokenizer.
Returns:
| Type | Description |
|---|---|
| str | The string representation of the tokenizer. |
Source code in aiperf/common/tokenizer.py, lines 119–126.
__str__()
¶
Return a user-friendly string representation of the underlying tokenizer.
Returns:
| Type | Description |
|---|---|
| str | The string representation of the tokenizer. |
Source code in aiperf/common/tokenizer.py, lines 128–135.
decode(token_ids, **kwargs)
¶
Decode a list of token IDs back into a string.
This method calls the underlying Huggingface tokenizer's decode method with default arguments, which can be overridden by kwargs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| token_ids | | A list of token IDs to decode. | required |
Returns:
| Type | Description |
|---|---|
str
|
The decoded string. |
Source code in aiperf/common/tokenizer.py, lines 93–108.
encode(text, **kwargs)
¶
Encode the input text into a list of token IDs.
This method calls the underlying Huggingface tokenizer's encode method with default arguments, which can be overridden by kwargs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | | The input text to encode. | required |
Returns:
| Type | Description |
|---|---|
list[int]
|
A list of token IDs. |
Source code in aiperf/common/tokenizer.py, lines 76–91.
from_pretrained(name, trust_remote_code=False, revision='main')
classmethod
¶
Factory to load a tokenizer for the given pretrained model name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The name or path of the pretrained tokenizer model. | required |
| trust_remote_code | bool | Whether to trust remote code when loading the tokenizer. | False |
| revision | str | The specific model version to use. | 'main' |
Source code in aiperf/common/tokenizer.py, lines 37–59.
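The encode/decode interface above can be exercised with a toy whitespace tokenizer standing in for a real Huggingface model. This stand-in is purely illustrative; actual usage would go through Tokenizer.from_pretrained with a model name:

```python
class WhitespaceTokenizer:
    """Toy tokenizer mirroring the encode/decode interface described above."""

    def __init__(self):
        self._vocab = {}    # word -> token ID
        self._inverse = {}  # token ID -> word

    def encode(self, text, **kwargs):
        ids = []
        for word in text.split():
            if word not in self._vocab:
                idx = len(self._vocab)
                self._vocab[word] = idx
                self._inverse[idx] = word
            ids.append(self._vocab[word])
        return ids

    def decode(self, token_ids, **kwargs):
        return " ".join(self._inverse[i] for i in token_ids)
```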
aiperf.common.types¶
This module defines common used alias types for AIPerf. This both helps prevent circular imports and helps with type hinting.
aiperf.common.utils¶
call_all_functions(funcs, *args, **kwargs)
async
¶
Call all of the given functions with the provided arguments.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| funcs | | The functions to call. | required |
| *args | | The arguments to pass to the functions. | () |
| **kwargs | | The keyword arguments to pass to the functions. | {} |
Raises:
| Type | Description |
|---|---|
| AIPerfMultiError | If any of the functions raise an exception. |
Source code in aiperf/common/utils.py, lines 50–76.
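The call-everything-then-raise behavior can be sketched as follows. The MultiError class here is a hypothetical stand-in for AIPerfMultiError, and the body is an assumed implementation:

```python
import asyncio

class MultiError(Exception):
    """Aggregates exceptions raised by multiple callables (stand-in)."""
    def __init__(self, errors):
        super().__init__(f"{len(errors)} function(s) failed")
        self.errors = errors

async def call_all_functions(funcs, *args, **kwargs):
    """Call every function (awaiting coroutines) and raise once at the end."""
    errors = []
    for func in funcs:
        try:
            result = func(*args, **kwargs)
            if asyncio.iscoroutine(result):
                await result
        except Exception as e:  # collect instead of failing fast
            errors.append(e)
    if errors:
        raise MultiError(errors)
```

Collecting errors rather than failing fast ensures every function runs even if an earlier one raised, which matters for cleanup-style call sequences.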
call_all_functions_self(self_, funcs, *args, **kwargs)
async
¶
Call all of the given functions on the object with the provided arguments.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| self_ | | The object to call the functions on. | required |
| funcs | | The functions to call. | required |
| *args | | The arguments to pass to the functions. | () |
| **kwargs | | The keyword arguments to pass to the functions. | {} |
Raises:
| Type | Description |
|---|---|
| AIPerfMultiError | If any of the functions raise an exception. |
Source code in aiperf/common/utils.py, lines 19–47.
load_json_str(json_str, func=lambda x: x)
¶
Deserializes a JSON-encoded string into a Python object.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| json_str | str | JSON-encoded string to deserialize. | required |
| func | callable | A function that takes the deserialized JSON object; can be used to run validation checks on the object. | lambda x: x |
Source code in aiperf/common/utils.py, lines 79–98.
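A minimal sketch of this helper, assuming the documented signature (the error handling shown is an assumption, not the actual implementation):

```python
import json

def load_json_str(json_str, func=lambda x: x):
    """Deserialize a JSON string, passing the result through a validation hook."""
    try:
        return func(json.loads(json_str))
    except json.JSONDecodeError as e:
        raise ValueError(f"Failed to parse JSON string: {e}") from e
```

The `func` hook lets callers validate or transform the object in the same call, e.g. `load_json_str(raw, func=MyModel.model_validate)`.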
yield_to_event_loop()
async
¶
Yield to the event loop. This forces the current coroutine to yield and allow other coroutines to run, preventing starvation. Use this when you do not want to delay your coroutine via sleep, but still want to allow other coroutines to run if there is a potential for an infinite loop.
Source code in aiperf/common/utils.py, lines 101–107.
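The yield-without-sleeping behavior is conventionally implemented with `asyncio.sleep(0)`; a sketch demonstrating the interleaving it enables (the helper body is an assumption):

```python
import asyncio

async def yield_to_event_loop():
    """Yield control so other ready coroutines can run (commonly sleep(0))."""
    await asyncio.sleep(0)

async def main():
    order = []

    async def busy():
        for _ in range(3):
            order.append("busy")
            await yield_to_event_loop()  # let `other` interleave

    async def other():
        for _ in range(3):
            order.append("other")
            await yield_to_event_loop()

    await asyncio.gather(busy(), other())
    return order
```

Without the yield, a CPU-bound loop inside `busy` would starve `other` until it finished.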
aiperf.controller.base_service_manager¶
BaseServiceManager
¶
Bases: AIPerfLifecycleMixin, ABC
Base class for service managers. It provides a common interface for managing services.
Source code in aiperf/controller/base_service_manager.py, lines 19–120.
stop_services_by_type(service_types)
async
¶
Stop a set of services.
Source code in aiperf/controller/base_service_manager.py, lines 73–87.
aiperf.controller.kubernetes_service_manager¶
KubernetesServiceManager
¶
Bases: BaseServiceManager
Service Manager for starting and stopping services in a Kubernetes cluster.
Source code in aiperf/controller/kubernetes_service_manager.py, lines 28–96.
kill_all_services()
async
¶
Kill all required services as Kubernetes pods.
Source code in aiperf/controller/kubernetes_service_manager.py, lines 62–68.
run_service(service_type, num_replicas=1)
async
¶
Run a service as a Kubernetes pod.
Source code in aiperf/controller/kubernetes_service_manager.py, lines 44–52.
shutdown_all_services()
async
¶
Stop all required services as Kubernetes pods.
Source code in aiperf/controller/kubernetes_service_manager.py, lines 54–60.
wait_for_all_services_registration(stop_event, timeout_seconds=DEFAULT_SERVICE_REGISTRATION_TIMEOUT)
async
¶
Wait for all required services to be registered in Kubernetes.
Source code in aiperf/controller/kubernetes_service_manager.py, lines 70–82.
wait_for_all_services_start(stop_event, timeout_seconds=DEFAULT_SERVICE_START_TIMEOUT)
async
¶
Wait for all required services to be started in Kubernetes.
Source code in aiperf/controller/kubernetes_service_manager.py, lines 84–96.
ServiceKubernetesRunInfo
¶
Bases: BaseModel
Information about a service running in a Kubernetes pod.
Source code in aiperf/controller/kubernetes_service_manager.py, lines 20–25.
aiperf.controller.multiprocess_service_manager¶
MultiProcessRunInfo
¶
Bases: BaseModel
Information about a service running as a multiprocessing process.
Source code in aiperf/controller/multiprocess_service_manager.py, lines 27–40.
MultiProcessServiceManager
¶
Bases: BaseServiceManager
Service Manager for starting and stopping services as multiprocessing processes.
Source code in aiperf/controller/multiprocess_service_manager.py
Lines 43–224.
kill_all_services()
async
¶
Kill all required services as multiprocessing processes.
Source code in aiperf/controller/multiprocess_service_manager.py
Lines 124–137.
run_service(service_type, num_replicas=1)
async
¶
Run a service with the given number of replicas.
Source code in aiperf/controller/multiprocess_service_manager.py
Lines 62–96.
shutdown_all_services()
async
¶
Stop all required services as multiprocessing processes.
Source code in aiperf/controller/multiprocess_service_manager.py
Lines 114–122.
wait_for_all_services_registration(stop_event, timeout_seconds=DEFAULT_SERVICE_REGISTRATION_TIMEOUT)
async
¶
Wait for all required services to be registered.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| stop_event | Event | Event to check if operation should be cancelled | required |
| timeout_seconds | float | Maximum time to wait in seconds | DEFAULT_SERVICE_REGISTRATION_TIMEOUT |
Source code in aiperf/controller/multiprocess_service_manager.py
Lines 139–194.
wait_for_all_services_start(stop_event, timeout_seconds=DEFAULT_SERVICE_START_TIMEOUT)
async
¶
Wait for all required services to be started.
Source code in aiperf/controller/multiprocess_service_manager.py
Lines 215–224.
aiperf.controller.proxy_manager¶
aiperf.controller.system_controller¶
SystemController
¶
Bases: SignalHandlerMixin, BaseService
System Controller service.
This service is responsible for managing the lifecycle of all other services. It will start, stop, and configure all other services.
Source code in aiperf/controller/system_controller.py
Lines 57–431.
initialize()
async
¶
Overrides the base initialize method to run the proxy manager before the base service initialization, because the proxies must be running before the service can subscribe to the message bus.
Source code in aiperf/controller/system_controller.py
Lines 117–124.
main()
¶
Main entry point for the system controller.
Source code in aiperf/controller/system_controller.py
Lines 434–439.
aiperf.controller.system_mixins¶
SignalHandlerMixin
¶
Bases: AIPerfLoggerMixin
Mixin for services that need to handle system signals.
Source code in aiperf/controller/system_mixins.py
Lines 10–32.
setup_signal_handlers(callback)
¶
Sets up signal handlers for the SIGTERM and SIGINT signals in order to trigger a graceful shutdown of the service.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| callback | Callable[[int], Coroutine] | The callback to call when a signal is received | required |
Source code in aiperf/controller/system_mixins.py
Lines 18–32.
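The behavior described above can be sketched with asyncio's `loop.add_signal_handler` (a simplified, standalone approximation of the mixin, not the actual implementation; all names below are illustrative):

```python
import asyncio
import os
import signal

def setup_signal_handlers(callback):
    """Register SIGTERM/SIGINT handlers that schedule an async callback."""
    loop = asyncio.get_running_loop()
    for sig in (signal.SIGTERM, signal.SIGINT):
        # add_signal_handler takes a sync callable, so wrap the coroutine
        # callback in a task when the signal arrives.
        loop.add_signal_handler(sig, lambda s=sig: asyncio.ensure_future(callback(s)))

async def main():
    received = asyncio.Event()
    signals = []

    async def on_signal(signum: int):
        signals.append(signum)
        received.set()  # a real service would begin graceful shutdown here

    setup_signal_handlers(on_signal)
    os.kill(os.getpid(), signal.SIGTERM)  # simulate an external SIGTERM
    await asyncio.wait_for(received.wait(), timeout=5)
    return signals
```

Calling `asyncio.run(main())` exercises the handler by delivering a real SIGTERM to the current process (Unix only).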
aiperf.dataset.composer.base¶
BaseDatasetComposer
¶
Bases: AIPerfLoggerMixin, ABC
Source code in aiperf/dataset/composer/base.py
Lines 20–88.
create_dataset()
abstractmethod
¶
Create a set of conversation objects from the given configuration.
Returns:

| Type | Description |
|---|---|
| list[Conversation] | A list of conversation objects. |
Source code in aiperf/dataset/composer/base.py
Lines 29–37.
aiperf.dataset.composer.custom¶
CustomDatasetComposer
¶
Bases: BaseDatasetComposer
Source code in aiperf/dataset/composer/custom.py
Lines 15–55.
create_dataset()
¶
Create conversations from a file or directory.
Returns:

| Type | Description |
|---|---|
| list[Conversation] | A list of conversation objects. |
Source code in aiperf/dataset/composer/custom.py
Lines 21–34.
aiperf.dataset.composer.synthetic¶
SyntheticDatasetComposer
¶
Bases: BaseDatasetComposer
Source code in aiperf/dataset/composer/synthetic.py
Lines 15–161.
create_dataset()
¶
Create a synthetic conversation dataset from the given configuration.
It generates a set of conversations with a varying number of turns, where each turn contains synthetic text, image, and audio payloads.
Returns:

| Type | Description |
|---|---|
| list[Conversation] | A list of conversation objects. |
Source code in aiperf/dataset/composer/synthetic.py
Lines 31–54.
aiperf.dataset.dataset_manager¶
DatasetManager
¶
Bases: ReplyClientMixin, BaseComponentService
The DatasetManager's primary responsibility is to manage data generation or acquisition. For synthetic generation, it contains the code to generate the prompts or tokens. It also provides an API for acquiring a dataset when one is available in a remote repository or database.
Source code in aiperf/dataset/dataset_manager.py
Lines 37–252.
main()
¶
Main entry point for the dataset manager.
Source code in aiperf/dataset/dataset_manager.py
Lines 255–260.
aiperf.dataset.generator.audio¶
AudioGenerator
¶
Bases: BaseGenerator
A class for generating synthetic audio data.
This class provides methods to create audio samples with specified characteristics such as format (WAV, MP3), length, sampling rate, bit depth, and number of channels. It supports validation of audio parameters to ensure compatibility with chosen formats.
Source code in aiperf/dataset/generator/audio.py
Lines 38–174.
generate(*args, **kwargs)
¶
Generate audio data with specified parameters.
Returns:

| Type | Description |
|---|---|
| str | Data URI containing base64-encoded audio data with format specification |

Raises:

| Type | Description |
|---|---|
| ConfigurationError | If any of the following conditions are met: audio length is less than 0.01 seconds; channels is not 1 (mono) or 2 (stereo); sampling rate is not supported for MP3 format; bit depth is not supported (must be 8, 16, 24, or 32); audio format is not supported (must be 'wav' or 'mp3') |
Source code in aiperf/dataset/generator/audio.py
Lines 92–174.
aiperf.dataset.generator.base¶
BaseGenerator
¶
Bases: AIPerfLoggerMixin, ABC
Abstract base class for all data generators.
Provides a consistent interface for generating synthetic data while allowing each generator type to use its own specific configuration and runtime parameters.
Source code in aiperf/dataset/generator/base.py
Lines 9–27.
generate(*args, **kwargs)
abstractmethod
¶
Generate synthetic data.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| *args | | Variable length argument list (subclass-specific) | () |
| **kwargs | | Arbitrary keyword arguments (subclass-specific) | {} |

Returns:

| Type | Description |
|---|---|
| str | Generated data as a string (could be text, base64-encoded media, etc.) |
Source code in aiperf/dataset/generator/base.py
Lines 16–27.
aiperf.dataset.generator.image¶
ImageGenerator
¶
Bases: BaseGenerator
A class that generates images from source images.
This class provides methods to create synthetic images by resizing source images (located in the 'assets/source_images' directory) to specified dimensions and converting them to a chosen image format (e.g., PNG, JPEG). The dimensions can be randomized based on mean and standard deviation values.
Source code in aiperf/dataset/generator/image.py
Lines 16–69.
generate(*args, **kwargs)
¶
Generate an image with the configured parameters.
Returns:

| Type | Description |
|---|---|
| str | A base64-encoded string of the generated image. |
Source code in aiperf/dataset/generator/image.py
Lines 29–57.
aiperf.dataset.generator.prompt¶
PromptGenerator
¶
Bases: BaseGenerator
A class for generating synthetic prompts from a text corpus.
This class loads a text corpus (e.g., Shakespearean text), tokenizes it, and uses the tokenized corpus to generate synthetic prompts of specified lengths. It supports generating prompts with a target number of tokens (with optional randomization around a mean and standard deviation) and can reuse previously generated token blocks to optimize generation for certain use cases. It also allows for the creation of a pool of prefix prompts that can be randomly selected.
Source code in aiperf/dataset/generator/prompt.py
Lines 22–230.
generate(mean=None, stddev=None, hash_ids=None)
¶
Generate a synthetic prompt with the configuration parameters.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| mean | int \| None | The mean of the normal distribution. | None |
| stddev | int \| None | The standard deviation of the normal distribution. | None |
| hash_ids | list[int] \| None | A list of hash indices used for token reuse. | None |

Returns:

| Type | Description |
|---|---|
| str | A synthetic prompt as a string. |
Source code in aiperf/dataset/generator/prompt.py
Lines 96–118.
get_random_prefix_prompt()
¶
Fetch a random prefix prompt from the pool.
Returns:

| Type | Description |
|---|---|
| str | A random prefix prompt. |

Raises:

| Type | Description |
|---|---|
| InvalidStateError | If the prefix prompts pool is empty. |
Source code in aiperf/dataset/generator/prompt.py
Lines 215–230.
aiperf.dataset.loader.mixins¶
MediaConversionMixin
¶
Mixin providing shared media conversion functionality for dataset loaders. It is used to construct text, image, and audio data from a CustomDatasetT object.
Source code in aiperf/dataset/loader/mixins.py
Lines 11–75.
convert_to_media_objects(data, name='')
¶
Convert all custom dataset into media objects.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | CustomDatasetT | The custom dataset to convert into media objects. | required |
| name | str | The name of the media field. | '' |

Returns:

| Type | Description |
|---|---|
| dict[str, list[MediaT]] | A dictionary of media objects. |
Source code in aiperf/dataset/loader/mixins.py
Lines 21–41.
aiperf.dataset.loader.models¶
CustomDatasetT = TypeVar('CustomDatasetT', bound=(SingleTurn | MultiTurn | RandomPool | MooncakeTrace))
module-attribute
¶
A union type of all custom data types.
MooncakeTrace
¶
Bases: AIPerfBaseModel
Defines the schema for Mooncake trace data.
See https://github.com/kvcache-ai/Mooncake for more details.
Example:
{"timestamp": 1000, "input_length": 10, "output_length": 4, "hash_ids": [123, 456]}
Source code in aiperf/dataset/loader/models.py
Lines 149–167.
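A trace line in the schema above can be parsed with plain `json` plus a dataclass (a standalone sketch; the project actually validates with an AIPerfBaseModel, and this class only mirrors the documented fields):

```python
import json
from dataclasses import dataclass, field

@dataclass
class MooncakeTrace:
    """Minimal mirror of the documented Mooncake trace fields."""
    timestamp: int
    input_length: int
    output_length: int
    hash_ids: list[int] = field(default_factory=list)

line = '{"timestamp": 1000, "input_length": 10, "output_length": 4, "hash_ids": [123, 456]}'
trace = MooncakeTrace(**json.loads(line))
```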
MultiTurn
¶
Bases: AIPerfBaseModel
Defines the schema for multi-turn conversations.
The multi-turn custom dataset:
- supports multi-modal data (e.g. text, image, audio)
- supports multi-turn features (e.g. delay, sessions, etc.)
- supports client-side batching for each data (e.g. batch size > 1)
Source code in aiperf/dataset/loader/models.py
Lines 74–97.
validate_turns_not_empty()
¶
Ensure at least one turn is provided
Source code in aiperf/dataset/loader/models.py
Lines 92–97.
RandomPool
¶
Bases: AIPerfBaseModel
Defines the schema for random pool data entry.
The random pool custom dataset:
- supports multi-modal data (e.g. text, image, audio)
- supports client-side batching for each data (e.g. batch size > 1)
- supports named fields for each modality (e.g. text_field_a, text_field_b, etc.)
- DOES NOT support multi-turn or its features (e.g. delay, sessions, etc.)
Source code in aiperf/dataset/loader/models.py
Lines 100–146.
validate_at_least_one_modality()
¶
Ensure at least one modality is provided
Source code in aiperf/dataset/loader/models.py
Lines 139–146.
validate_mutually_exclusive_fields()
¶
Ensure mutually exclusive fields are not set together
Source code in aiperf/dataset/loader/models.py
Lines 128–137.
SingleTurn
¶
Bases: AIPerfBaseModel
Defines the schema for single-turn data.
Users can use this format to quickly provide a custom single-turn dataset. Each line in the file is treated as a single-turn conversation.
The single turn type:
- supports multi-modal data (e.g. text, image, audio)
- supports client-side batching for each data (e.g. batch_size > 1)
- DOES NOT support multi-turn features (e.g. session_id)
Source code in aiperf/dataset/loader/models.py
Lines 12–71.
validate_at_least_one_modality()
¶
Ensure at least one modality is provided
Source code in aiperf/dataset/loader/models.py
Lines 64–71.
validate_mutually_exclusive_fields()
¶
Ensure mutually exclusive fields are not set together
Source code in aiperf/dataset/loader/models.py
Lines 51–62.
aiperf.dataset.loader.mooncake_trace¶
MooncakeTraceDatasetLoader
¶
Bases: AIPerfLoggerMixin
A dataset loader that loads Mooncake trace data from a file.
Loads Mooncake trace data from a file and converts the data into a list of conversations for dataset manager.
Each line in the file represents a single trace entry and will be converted to a separate conversation with a unique session ID.
Example: Fixed schedule version (Each line is a distinct session. Multi-turn is NOT supported)
{"timestamp": 1000, "input_length": 300, "output_length": 40, "hash_ids": [123, 456]}
Source code in aiperf/dataset/loader/mooncake_trace.py
Lines 18–113.
convert_to_conversations(data)
¶
Convert all the Mooncake trace data to conversation objects.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | dict[str, list[MooncakeTrace]] | A dictionary of session_id and list of Mooncake trace data. | required |

Returns:

| Type | Description |
|---|---|
| list[Conversation] | A list of conversations. |
Source code in aiperf/dataset/loader/mooncake_trace.py
Lines 86–113.
load_dataset()
¶
Load Mooncake trace data from a file.
Returns:

| Type | Description |
|---|---|
| dict[str, list[MooncakeTrace]] | A dictionary of session_id and list of Mooncake trace data. |
Source code in aiperf/dataset/loader/mooncake_trace.py
Lines 51–79.
aiperf.dataset.loader.multi_turn¶
MultiTurnDatasetLoader
¶
Bases: MediaConversionMixin
A dataset loader that loads multi-turn data from a file.
The multi-turn type:
- supports multi-modal data (e.g. text, image, audio)
- supports multi-turn features (e.g. delay, sessions, etc.)
- supports client-side batching for each data (e.g. batch_size > 1)

NOTE: If the user specifies multiple multi-turn entries with the same session ID, the loader will group them together. If timestamps are specified, they will be sorted in ascending order later in the timing manager.
Examples:
1. Simple version
{
"session_id": "session_123",
"turns": [
{"text": "Hello", "image": "url", "delay": 0},
{"text": "Hi there", "delay": 1000}
]
}
2. Batched version
{
"session_id": "session_123",
"turns": [
{"texts": ["Who are you?", "Hello world"], "images": ["/path/1.png", "/path/2.png"]},
{"texts": ["What is in the image?", "What is AI?"], "images": ["/path/3.png", "/path/4.png"]}
]
}
3. Fixed schedule version
{
"session_id": "session_123",
"turns": [
{"timestamp": 0, "text": "What is deep learning?"},
{"timestamp": 1000, "text": "Who are you?"}
]
}
4. Time delayed version
{
"session_id": "session_123",
"turns": [
{"delay": 0, "text": "What is deep learning?"},
{"delay": 1000, "text": "Who are you?"}
]
}
5. Full-featured version (multi-batch, multi-modal, multi-fielded, session-based, etc.)
{
"session_id": "session_123",
"turns": [
{
"timestamp": 1234,
"texts": [
{"name": "text_field_a", "contents": ["hello", "world"]},
{"name": "text_field_b", "contents": ["hi there"]}
],
"images": [
{"name": "image_field_a", "contents": ["/path/1.png", "/path/2.png"]},
{"name": "image_field_b", "contents": ["/path/3.png"]}
]
}
]
}
Source code in aiperf/dataset/loader/multi_turn.py
Lines 14–148.
convert_to_conversations(data)
¶
Convert multi-turn data to conversation objects.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | dict[str, list[MultiTurn]] | A dictionary mapping session_id to list of MultiTurn objects. | required |

Returns:

| Type | Description |
|---|---|
| list[Conversation] | A list of conversations. |
Source code in aiperf/dataset/loader/multi_turn.py
Lines 118–148.
load_dataset()
¶
Load multi-turn data from a JSONL file.
Each line represents a complete multi-turn conversation with its own session_id and multiple turns.
Returns:

| Type | Description |
|---|---|
| dict[str, list[MultiTurn]] | A dictionary mapping session_id to list of MultiTurn objects. |
Source code in aiperf/dataset/loader/multi_turn.py
Lines 96–116.
aiperf.dataset.loader.protocol¶
aiperf.dataset.loader.random_pool¶
RandomPoolDatasetLoader
¶
Bases: MediaConversionMixin
A dataset loader that loads data from a single file or a directory.
Each line in the file represents single-turn conversation data, and files create individual pools for random sampling:
- Single file: all lines form one pool (to be randomly sampled from)
- Directory: each file becomes a separate pool; pools are randomly sampled and merged into conversations later

The random pool custom dataset:
- supports multi-modal data (e.g. text, image, audio)
- supports client-side batching for each data (e.g. batch size > 1)
- supports named fields for each modality (e.g. text_field_a, text_field_b, etc.)
- DOES NOT support multi-turn or its features (e.g. delay, sessions, etc.)
Example:
- Single file
{"text": "Who are you?", "image": "/path/to/image1.png"}
{"text": "Explain what is the meaning of life.", "image": "/path/to/image2.png"}
...
The file will form a single pool of text and image data that will be used to generate conversations.
- Directory
A directory is useful if the user wants to:
- create multiple pools of different modalities separately (e.g. text, image)
- specify different field names for the same modality
data/queries.jsonl
{"texts": [{"name": "query", "contents": ["Who are you?"]}]}
{"texts": [{"name": "query", "contents": ["What is the meaning of life?"]}]}
...
data/passages.jsonl
{"texts": [{"name": "passage", "contents": ["I am a cat."]}]}
{"texts": [{"name": "passage", "contents": ["I am a dog."]}]}
...
The loader will create a separate pool for each file: queries and passages. Each pool is a text dataset with a different field name (e.g. query, passage), and the loader will later sample from these two pools to create conversations.
Source code in aiperf/dataset/loader/random_pool.py
Lines 20–192.
convert_to_conversations(data)
¶
Convert random pool data to conversation objects.
Each RandomPool entry becomes a single-turn conversation with a unique session ID.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | dict[Filename, list[RandomPool]] | A dictionary mapping filename to list of RandomPool objects. | required |

Returns:

| Type | Description |
|---|---|
| list[Conversation] | A list of conversations. |
Source code in aiperf/dataset/loader/random_pool.py
Lines 135–176.
load_dataset()
¶
Load random pool data from a file or directory.
If filename is a file, reads and parses using the RandomPool model. If filename is a directory, reads each file in the directory and merges items with different modality names into combined RandomPool objects.
Returns:

| Type | Description |
|---|---|
| dict[Filename, list[RandomPool]] | A dictionary mapping filename to list of RandomPool objects. |
Source code in aiperf/dataset/loader/random_pool.py
Lines 76–92.
aiperf.dataset.loader.single_turn¶
SingleTurnDatasetLoader
¶
Bases: MediaConversionMixin
A dataset loader that loads single turn data from a file.
The single turn type:
- supports multi-modal data (e.g. text, image, audio)
- supports client-side batching for each data (e.g. batch_size > 1)
- DOES NOT support multi-turn features (e.g. delay, sessions, etc.)
Examples:
1. Single-batch, text only
{"text": "What is deep learning?"}
2. Single-batch, multi-modal
{"text": "What is in the image?", "image": "/path/to/image.png"}
3. Multi-batch, multi-modal
{"texts": ["Who are you?", "Hello world"], "images": ["/path/to/image.png", "/path/to/image2.png"]}
4. Fixed schedule version
{"timestamp": 0, "text": "What is deep learning?"},
{"timestamp": 1000, "text": "Who are you?"},
{"timestamp": 2000, "text": "What is AI?"}
5. Time delayed version
{"delay": 0, "text": "What is deep learning?"},
{"delay": 1234, "text": "Who are you?"}
6. Full-featured version (multi-batch, multi-modal, multi-fielded)
{
"texts": [
{"name": "text_field_A", "contents": ["Hello", "World"]},
{"name": "text_field_B", "contents": ["Hi there"]}
],
"images": [
{"name": "image_field_A", "contents": ["/path/1.png", "/path/2.png"]},
{"name": "image_field_B", "contents": ["/path/3.png"]}
]
}
Source code in aiperf/dataset/loader/single_turn.py
Lines 14–119.
convert_to_conversations(data)
¶
Convert single turn data to conversation objects.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | dict[str, list[SingleTurn]] | A dictionary mapping session_id to list of SingleTurn objects. | required |

Returns:

| Type | Description |
|---|---|
| list[Conversation] | A list of conversations. |
Source code in aiperf/dataset/loader/single_turn.py
Lines 92–119.
load_dataset()
¶
Load single-turn data from a JSONL file.
Each line represents a single turn conversation. Multiple turns with the same session_id (or generated UUID) are grouped together.
Returns:

| Type | Description |
|---|---|
| dict[str, list[SingleTurn]] | A dictionary mapping session_id to list of CustomData. |
Source code in aiperf/dataset/loader/single_turn.py
Lines 70–90.
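The grouping rule described above (entries with the same session_id are grouped together, entries without one get a fresh UUID) can be sketched as a standalone helper (the function name here is illustrative, not the project's API):

```python
import json
import uuid
from collections import defaultdict

def group_single_turns(lines: list[str]) -> dict[str, list[dict]]:
    """Group JSONL single-turn entries by session_id, generating one per line if absent."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for line in lines:
        entry = json.loads(line)
        # Entries without a session_id become their own one-entry group.
        session_id = entry.get("session_id") or str(uuid.uuid4())
        grouped[session_id].append(entry)
    return dict(grouped)
```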
aiperf.dataset.utils¶
check_file_exists(filename)
¶
Verifies that the file exists.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| filename | | The file path to verify. | required |

Raises:

| Type | Description |
|---|---|
| FileNotFoundError | If the file does not exist. |
Source code in aiperf/dataset/utils.py
Lines 18–28.
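A minimal equivalent of the described behavior using pathlib (a sketch, not the project's actual implementation):

```python
from pathlib import Path

def check_file_exists(filename) -> None:
    """Raise FileNotFoundError if the given path is not an existing file."""
    if not Path(filename).is_file():
        raise FileNotFoundError(f"File not found: {filename}")
```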
encode_image(img, format)
¶
Encodes an image into a base64-encoded string.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| img | Image | The PIL Image object to encode. | required |
| format | str | The image format to use (e.g., "JPEG", "PNG"). | required |

Returns:

| Type | Description |
|---|---|
| str | A base64-encoded string representation of the image. |
Source code in aiperf/dataset/utils.py
Lines 57–74.
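The core of such a helper is save-to-buffer plus base64. This sketch mirrors that via duck typing so it runs without Pillow installed; any object exposing a PIL-style save(fp, format=...) works:

```python
import base64
import io

def encode_image(img, format: str) -> str:
    """Save the image into an in-memory buffer and return it base64-encoded."""
    buffer = io.BytesIO()
    img.save(buffer, format=format)  # PIL Images expose save(fp, format=...)
    return base64.b64encode(buffer.getvalue()).decode("utf-8")
```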
load_json_str(json_str, func=lambda x: x)
¶
Deserializes a JSON-encoded string into a Python object.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| json_str | str | JSON encoded string | required |
| func | Callable | A function that takes the deserialized JSON object. This can be used to run validation checks on the object. Defaults to the identity function. | lambda x: x |

Returns:

| Type | Description |
|---|---|
| dict[str, Any] | The deserialized JSON object. |

Raises:

| Type | Description |
|---|---|
| RuntimeError | If the JSON string is invalid. |
Source code in aiperf/dataset/utils.py
Lines 77–96.
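The described contract (parse, optionally post-process via func, wrap parse errors in RuntimeError) fits in a few lines; a sketch consistent with the table above:

```python
import json
from typing import Any, Callable

def load_json_str(json_str: str, func: Callable = lambda x: x) -> Any:
    """Deserialize a JSON string, applying func to the result (e.g. for validation)."""
    try:
        return func(json.loads(json_str))
    except json.JSONDecodeError as e:
        # Re-raise as RuntimeError so callers handle one error type.
        raise RuntimeError(f"Failed to parse JSON string: {json_str!r}") from e
```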
open_image(filename)
¶
Opens an image file.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| filename | | The file path to open. | required |

Returns:

| Type | Description |
|---|---|
| Image | The opened PIL Image object. |

Raises:

| Type | Description |
|---|---|
| FileNotFoundError | If the file does not exist. |
Source code in aiperf/dataset/utils.py
31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 | |
sample_normal(mean, stddev, lower=-np.inf, upper=np.inf)
¶
Sample from a normal distribution with support for bounds using rejection sampling.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `mean` | `float` | The mean of the normal distribution. | *required* |
| `stddev` | `float` | The standard deviation of the normal distribution. | *required* |
| `lower` | `float` | The lower bound of the distribution. | `-inf` |
| `upper` | `float` | The upper bound of the distribution. | `inf` |

Returns:

| Type | Description |
|---|---|
| `int` | An integer sampled from the distribution. |
Source code in aiperf/dataset/utils.py
99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 | |
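Rejection sampling as described here can be sketched as follows (an illustrative reimplementation; the real source may cap the number of attempts when the bounds exclude most of the distribution's mass):

```python
import math
import random

def sample_normal(mean: float, stddev: float,
                  lower: float = -math.inf, upper: float = math.inf) -> float:
    """Draw from N(mean, stddev), rejecting samples outside [lower, upper]."""
    while True:
        value = random.gauss(mean, stddev)
        if lower <= value <= upper:
            return value
```

Rejection keeps the shape of the truncated distribution intact, unlike clamping out-of-range samples to the bounds, which would pile probability mass at the endpoints.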
sample_positive_normal(mean, stddev)
¶
Sample from a normal distribution ensuring positive values without distorting the distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `mean` | `float` | Mean value for the normal distribution. | *required* |
| `stddev` | `float` | Standard deviation for the normal distribution. | *required* |

Returns:

| Type | Description |
|---|---|
| `float` | A positive sample from the normal distribution. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If mean is less than 0. |
Source code in aiperf/dataset/utils.py
119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 | |
sample_positive_normal_integer(mean, stddev)
¶
Sample a random positive integer from a normal distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `mean` | `float` | The mean of the normal distribution. | *required* |
| `stddev` | `float` | The standard deviation of the normal distribution. | *required* |

Returns:

| Type | Description |
|---|---|
| `int` | A positive integer sampled from the distribution. If the sampled number is less than 1, it returns 1. |
Source code in aiperf/dataset/utils.py
138 139 140 141 142 143 144 145 146 147 148 149 | |
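The documented behavior (draw a positive sample, round it, clamp at 1) can be sketched like this; assume the real implementation delegates the positivity step to `sample_positive_normal`:

```python
import random

def sample_positive_normal_integer(mean: float, stddev: float) -> int:
    """Sketch: draw a positive normal sample, round it, and clamp at 1."""
    value = random.gauss(mean, stddev)
    while value <= 0:  # rejection step, standing in for sample_positive_normal()
        value = random.gauss(mean, stddev)
    return max(1, round(value))
```

This is useful for sampling quantities that must be whole and positive, such as synthetic token counts.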
aiperf.exporters.console_error_exporter¶
ConsoleErrorExporter
¶
A class that exports error data to the console.
Source code in aiperf/exporters/console_error_exporter.py
15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 | |
aiperf.exporters.console_metrics_exporter¶
ConsoleMetricsExporter
¶
Bases: AIPerfLoggerMixin
A class that exports data to the console.
Source code in aiperf/exporters/console_metrics_exporter.py
22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 | |
aiperf.exporters.display_units_utils¶
convert_all_metrics_to_display_units(records, registry)
¶
Helper for exporters that want a tag->result mapping in display units.
Source code in aiperf/exporters/display_units_utils.py
65 66 67 68 69 70 71 72 | |
to_display_unit(result, registry)
¶
Return a new MetricResult converted to its display unit (if different).
Source code in aiperf/exporters/display_units_utils.py
29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 | |
aiperf.exporters.exporter_config¶
aiperf.exporters.exporter_manager¶
ExporterManager
¶
Bases: AIPerfLoggerMixin
ExporterManager is responsible for exporting records using all registered data exporters.
Source code in aiperf/exporters/exporter_manager.py
15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 | |
aiperf.exporters.json_exporter¶
JsonExportData
¶
Bases: BaseModel
Data to be exported to a JSON file.
Source code in aiperf/exporters/json_exporter.py
23 24 25 26 27 28 29 30 31 | |
JsonExporter
¶
Bases: AIPerfLoggerMixin
A class to export records to a JSON file.
Source code in aiperf/exporters/json_exporter.py
34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 | |
aiperf.metrics.base_aggregate_metric¶
BaseAggregateMetric
¶
Bases: Generic[MetricValueTypeVarT], BaseMetric[MetricValueTypeVarT], ABC
A base class for aggregate metrics. These metrics keep track of a value or list of values over time.
This metric type is unique in that it is split into two distinct phases of processing, in order to support distributed processing.
For each distributed RecordProcessor, an instance of this class is created. This instance is passed the record and the existing record metrics, and is responsible for returning the individual value for that record. It should not use or update the aggregate value here.
The ResultsProcessor creates a singleton instance of this class, which will be used to aggregate the results from the distributed
RecordProcessors. It calls the _aggregate_value method, which each metric class must implement to define how values from different
processes are aggregated, such as summing the values, or taking the min/max/average, etc.
Examples:
    class RequestCountMetric(BaseAggregateMetric[int]):
        # ... Metric attributes ...

        def _parse_record(self, record: ParsedResponseRecord, record_metrics: MetricRecordDict) -> int:
            # We just return 1 since we are tracking the total count, and this is a single request.
            return 1

        def _aggregate_value(self, value: int) -> None:
            # We add the value to the aggregate value.
            self._value += value
Source code in aiperf/metrics/base_aggregate_metric.py
13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 | |
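The two-phase flow described above can be illustrated with a toy count metric that is independent of the aiperf classes (names here are illustrative, not the real API):

```python
class ToyCountMetric:
    """Toy stand-in for a BaseAggregateMetric[int] tracking request count."""

    def __init__(self):
        self._value = 0  # aggregate value, only touched in phase 2

    def parse_record(self, record) -> int:
        # Phase 1 (per distributed RecordProcessor): return this record's
        # individual contribution without touching the aggregate.
        return 1

    def aggregate_value(self, value: int) -> None:
        # Phase 2 (singleton in the ResultsProcessor): fold contributions in.
        self._value += value

# Phase 1 runs in parallel across processors; phase 2 folds the results.
metric = ToyCountMetric()
records = ["req-a", "req-b", "req-c"]
for contribution in (metric.parse_record(r) for r in records):
    metric.aggregate_value(contribution)
```

Splitting parse from aggregate is what lets record processing scale out while the final value stays consistent in one place.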
current_value
property
¶
Get the current value of the metric.
__init__(default_value=None)
¶
Initialize the metric, optionally with a default value. If no default value is provided, the default is set automatically based on the value type.
Source code in aiperf/metrics/base_aggregate_metric.py
44 45 46 47 48 49 50 51 52 53 54 55 | |
parse_record(record, record_metrics)
¶
Parse the record and return the individual value.
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the metric cannot be computed for the given inputs. |
Source code in aiperf/metrics/base_aggregate_metric.py
62 63 64 65 66 67 68 69 70 71 72 | |
aiperf.metrics.base_derived_metric¶
BaseDerivedMetric
¶
Bases: Generic[MetricValueTypeVarT], BaseMetric[MetricValueTypeVarT], ABC
A base class for derived metrics. These metrics are computed from other metrics, and do not require any knowledge of the individual records. The final results will be a single computed value (or list of values).
NOTE: The generic type can be a list of values, or a single value.
Examples:
    class RequestThroughputMetric(BaseDerivedMetric[float]):
        # ... Metric attributes ...

        def _derive_value(self, metric_results: MetricResultsDict) -> float:
            request_count = metric_results[RequestCountMetric.tag]
            benchmark_duration = metric_results[BenchmarkDurationMetric.tag]
            return request_count / (benchmark_duration / NANOS_PER_SECOND)
Source code in aiperf/metrics/base_derived_metric.py
11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 | |
derive_value(metric_results)
¶
Derive the metric value.
Source code in aiperf/metrics/base_derived_metric.py
33 34 35 36 | |
aiperf.metrics.base_metric¶
BaseMetric
¶
Bases: Generic[MetricValueTypeVarT], ABC
A definition of a metric type.
This class is not meant to be instantiated directly or subclassed directly. It is meant to be a common base for all of the base metric classes by type.
The class attributes are:

- `tag`: The tag of the metric. This must be a non-empty string that uniquely identifies the metric type.
- `header`: The header of the metric. This is the user-friendly name of the metric that will be displayed in the UI.
- `unit`: The unit of the internal representation of the metric. This is used for converting to other units and for display.
- `display_unit`: The unit of the metric that is used for display (if different from the unit). None means use the unit for display.
- `display_order`: The display order in the ConsoleExporter. Lower numbers are displayed first. None means unordered after any ordered metrics.
- `flags`: The flags of the metric that determine how and when it is computed and displayed.
- `required_metrics`: The metrics that must be available to compute the metric. This is a set of metric tags.
Source code in aiperf/metrics/base_metric.py
23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 | |
__init_subclass__(**kwargs)
¶
This method is called when a class is subclassed from Metric. It automatically registers the subclass in the MetricRegistry dictionary using the tag class attribute. The tag attribute must be a non-empty string that uniquely identifies the metric type. Only concrete (non-abstract) classes will be registered.
Source code in aiperf/metrics/base_metric.py
52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 | |
has_flags(flags)
classmethod
¶
Return True if the metric has the given flag(s) (regardless of other flags).
Source code in aiperf/metrics/base_metric.py
139 140 141 142 | |
missing_flags(flags)
classmethod
¶
Return True if the metric does not have the given flag(s) (regardless of other flags). It will return False if the metric has ANY of the given flags.
Source code in aiperf/metrics/base_metric.py
144 145 146 147 148 | |
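The semantics of `has_flags` and `missing_flags` described above map naturally onto bitwise operations over a flag enum. A sketch, using an illustrative subset of flags (the real `MetricFlags` enum has different members):

```python
from enum import Flag, auto

class MetricFlags(Flag):  # illustrative subset, not the real enum
    NONE = 0
    STREAMING_ONLY = auto()
    INTERNAL = auto()
    ERROR_ONLY = auto()

def has_flags(metric_flags: MetricFlags, flags: MetricFlags) -> bool:
    # True when ALL requested flags are present (other flags may also be set).
    return (metric_flags & flags) == flags

def missing_flags(metric_flags: MetricFlags, flags: MetricFlags) -> bool:
    # True when NONE of the requested flags are present; any overlap -> False.
    return (metric_flags & flags) == MetricFlags.NONE
```

Note the asymmetry: `has_flags` requires all of the given flags, while `missing_flags` returns False if any of them is present.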
aiperf.metrics.base_record_metric¶
BaseRecordMetric
¶
Bases: Generic[MetricValueTypeVarT], BaseMetric[MetricValueTypeVarT], ABC
A base class for record-based metrics. These metrics are computed for each record, and are independent of other records. The final results will be a list of values, one for each record.
NOTE: Set the generic type to be the type of the individual values, and NOT a list, unless the metric produces a list for every record. In that case, the result will be a list of lists.
Examples:
    class InputSequenceLengthMetric(BaseRecordMetric[int]):
        # ... Metric attributes ...
        # ... Input validation ...

        def _parse_record(
            self,
            record: ParsedResponseRecord,
            record_metrics: MetricRecordDict,
        ) -> int:
            return record.input_token_count
Source code in aiperf/metrics/base_record_metric.py
12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 | |
parse_record(record, record_metrics)
¶
Parse a single record and return the metric value.
Source code in aiperf/metrics/base_record_metric.py
38 39 40 41 42 43 44 | |
aiperf.metrics.metric_dicts¶
MetricRecordDict
¶
Bases: dict[MetricTagT, MetricValueTypeT]
A dict of metrics for a single record. This is used to store the current values of all metrics that have been computed for a single record.
This will include:
- The current value of any BaseRecordMetric that has been computed for this record.
- The new value of any BaseAggregateMetric that has been computed for this record.
- No BaseDerivedMetrics will be included.
Source code in aiperf/metrics/metric_dicts.py
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 | |
get_converted(metric, other_unit)
¶
Get the value of a metric, but converted to a different unit.
Source code in aiperf/metrics/metric_dicts.py
28 29 30 31 32 | |
MetricResultsDict
¶
Bases: dict[MetricTagT, MetricDictValueTypeT]
A dict of metrics over an entire run. This is used to store the final values of all metrics that have been computed for an entire run.
This will include:
- All BaseRecordMetrics as a deque of their values.
- The most recent value of each BaseAggregateMetric.
- The value of any BaseDerivedMetric that has already been computed.
Source code in aiperf/metrics/metric_dicts.py
35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 | |
get_converted(metric, other_unit)
¶
Get the value of a metric, but converted to a different unit.
Source code in aiperf/metrics/metric_dicts.py
46 47 48 49 50 51 52 53 54 55 | |
aiperf.metrics.metric_registry¶
MetricRegistry
¶
A registry for metrics.
This is used to store all the metrics that are available to the system. It is used to lookup metrics by their tag, and to get all the metrics that are available. It also provides methods to get metrics by their type, flag, and to create a dependency order for the metrics. It is also used to create instances of metrics.
This class is not meant to be instantiated directly. It is meant to be used as a singleton via classmethods.
Source code in aiperf/metrics/metric_registry.py
27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 | |
all_classes()
classmethod
¶
Get all of the classes of the defined metric classes.
Source code in aiperf/metrics/metric_registry.py
169 170 171 172 | |
all_tags()
classmethod
¶
Get all of the tags of the defined metric classes.
Source code in aiperf/metrics/metric_registry.py
164 165 166 167 | |
classes_for(tags)
classmethod
¶
Get the classes for the given tags.
Raises:
| Type | Description |
|---|---|
| `MetricTypeError` | If a tag is not found. |
Source code in aiperf/metrics/metric_registry.py
174 175 176 177 178 179 180 181 | |
create_dependency_order()
classmethod
¶
Create a dependency order for all available metrics using topological sort.
See `create_dependency_order_for` for more details.
Source code in aiperf/metrics/metric_registry.py
237 238 239 240 241 242 243 244 | |
create_dependency_order_for(tags=None)
classmethod
¶
Create a dependency order for the given metrics using topological sort.
This ensures that all dependencies are computed before their dependents.
If `tags` is provided, only those tags will be included in the order.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `tags` | `Iterable[MetricTagT] \| None` | The tags of the metrics to compute the dependency order for. If not provided, all metrics will be included. | `None` |

Returns:

| Type | Description |
|---|---|
| `list[MetricTagT]` | List of metric tags in dependency order (dependencies first). |

Raises:

| Type | Description |
|---|---|
| `MetricTypeError` | If there are unregistered dependencies or circular dependencies. |
Source code in aiperf/metrics/metric_registry.py
246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 | |
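A topological sort over `required_metrics` dependencies can be sketched with the standard library's `graphlib` (the dependency map below is hypothetical, and this sketch raises `ValueError` where the real registry raises `MetricTypeError`):

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency map: metric tag -> set of required metric tags.
required_metrics = {
    "request_count": set(),
    "benchmark_duration": set(),
    "request_throughput": {"request_count", "benchmark_duration"},
}

def create_dependency_order_for(tags=None):
    """Return metric tags ordered so dependencies come before dependents."""
    graph = {t: required_metrics[t] for t in (tags or required_metrics)}
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as e:
        raise ValueError(f"Circular metric dependencies: {e.args[1]}") from e

order = create_dependency_order_for()
```

Computing metrics in this order guarantees that, for example, `request_throughput` only runs after both of its inputs are available.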
get_class(tag)
classmethod
¶
Get a metric class by its tag.
Raises:
| Type | Description |
|---|---|
| `MetricTypeError` | If the metric class is not found. |
Source code in aiperf/metrics/metric_registry.py
106 107 108 109 110 111 112 113 114 115 116 | |
get_instance(tag)
classmethod
¶
Get an instance of a metric class by its tag. This will create a new instance if it does not exist.
Raises:
| Type | Description |
|---|---|
| `MetricTypeError` | If the metric class is not found. |
Source code in aiperf/metrics/metric_registry.py
118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 | |
register_metric(metric)
classmethod
¶
Register a metric class with the registry. This will raise a MetricTypeError if the class is already registered.
This method is called automatically via the init_subclass method of the BaseMetric class, so there is no need to call it manually.
Source code in aiperf/metrics/metric_registry.py
89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 | |
tags_applicable_to(required_flags, disallowed_flags, *types)
classmethod
¶
Get metrics tags that are applicable to the given arguments.
This method is used to filter the metrics that are applicable to a given set of flags and types. For instance, it can be used to get only DERIVED metrics, or only metrics that are applicable to non-streaming endpoints, etc.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `required_flags` | `MetricFlags` | The flags that the metric must have. | *required* |
| `disallowed_flags` | `MetricFlags` | The flags that the metric must not have. | *required* |
| `types` | `MetricType` | The types of metrics to include. If not provided, all types will be included. | `()` |

Returns:

| Type | Description |
|---|---|
| `list[MetricTagT]` | A list of metric tags that are applicable to the given arguments. |
Source code in aiperf/metrics/metric_registry.py
135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 | |
aiperf.metrics.types.benchmark_duration_metric¶
BenchmarkDurationMetric
¶
Bases: BaseDerivedMetric[int]
Post-processor for calculating the Benchmark Duration metric.
Formula
Benchmark Duration = Maximum Response Timestamp - Minimum Request Timestamp
Source code in aiperf/metrics/types/benchmark_duration_metric.py
11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 | |
aiperf.metrics.types.benchmark_token_count¶
BenchmarkTokenCountMetric
¶
Bases: BaseAggregateMetric[int]
Post-processor for calculating the Benchmark Token Count metric. This is the total number of tokens processed by the benchmark.
Formula
Benchmark Token Count = Sum of Output Sequence Lengths
Source code in aiperf/metrics/types/benchmark_token_count.py
14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 | |
aiperf.metrics.types.credit_drop_latency_metric¶
CreditDropLatencyMetric
¶
Bases: BaseRecordMetric[int]
Post-processor for calculating Credit Drop Latency metrics from records. This is an internal metric that is intended to be used for debugging and performance analysis of the AIPerf internal system.
It exposes how long it took from when a credit was dropped, to when the actual request was sent. This will include the time it took to query the DatasetManager to get the Turn, as well as the time it took to format the request.
Formula
Credit Drop Latency = Request Start Time - Credit Drop Received Time
Source code in aiperf/metrics/types/credit_drop_latency_metric.py
11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 | |
aiperf.metrics.types.error_request_count¶
ErrorRequestCountMetric
¶
Bases: BaseAggregateMetric[int]
Post-processor for counting the number of error requests.
This metric is only applicable to error records.
Formula
Error Request Count = Sum(Error Requests)
Source code in aiperf/metrics/types/error_request_count.py
10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | |
aiperf.metrics.types.input_sequence_length_metric¶
InputSequenceLengthMetric
¶
Bases: BaseRecordMetric[int]
Post-processor for calculating Input Sequence Length (ISL) metrics from records.
Formula
Input Sequence Length = Sum of Input Token Counts
Source code in aiperf/metrics/types/input_sequence_length_metric.py
11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 | |
aiperf.metrics.types.input_throughput¶
InputThroughputMetric
¶
Bases: BaseRecordMetric[float]
Post-processor for calculating Input Throughput metrics from records. This is only applicable to streaming responses.
Formula
Input Throughput = Input Sequence Length / Time to First Token (seconds)
Source code in aiperf/metrics/types/input_throughput.py
12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 | |
aiperf.metrics.types.inter_token_latency_metric¶
InterTokenLatencyMetric
¶
Bases: BaseRecordMetric[float]
Post Processor for calculating Inter Token Latency (ITL) metric.
Formula
Inter Token Latency = (Request Latency - Time to First Token) / (Output Sequence Length - 1)
Source code in aiperf/metrics/types/inter_token_latency_metric.py
15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 | |
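The formula above can be made concrete with illustrative numbers (these values are made up for the worked example, not taken from any benchmark):

```python
# Worked example of the ITL formula, in milliseconds.
request_latency_ms = 1000.0    # total request latency
ttft_ms = 200.0                # time to first token
output_sequence_length = 101   # tokens produced

# Subtracting TTFT and dividing by (OSL - 1) averages over the gaps
# between consecutive tokens after the first.
itl_ms = (request_latency_ms - ttft_ms) / (output_sequence_length - 1)
# (1000 - 200) / 100 = 8.0 ms per token gap
```

The `- 1` reflects that N tokens have only N - 1 inter-token gaps.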
aiperf.metrics.types.max_response_metric¶
MaxResponseTimestampMetric
¶
Bases: BaseAggregateMetric[int]
Post-processor for calculating the maximum response timestamp metric from records.
Formula
Maximum Response Timestamp = Max(Final Response Timestamps)
Source code in aiperf/metrics/types/max_response_metric.py
11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 | |
aiperf.metrics.types.min_request_metric¶
MinRequestTimestampMetric
¶
Bases: BaseAggregateMetric[int]
Post-processor for calculating the minimum request timestamp metric from records.
Formula
Minimum Request Timestamp = Min(Request Timestamps)
Source code in aiperf/metrics/types/min_request_metric.py
11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 | |
aiperf.metrics.types.output_sequence_length_metric¶
OutputSequenceLengthMetric
¶
Bases: BaseRecordMetric[int]
Post-processor for calculating Output Sequence Length (OSL) metrics from records.
Formula
Output Sequence Length = Sum(Output Token Counts)
Source code in aiperf/metrics/types/output_sequence_length_metric.py
10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 | |
aiperf.metrics.types.output_token_throughput_metric¶
OutputTokenThroughputMetric
¶
Bases: BaseDerivedMetric[float]
Post Processor for calculating Output Token Throughput Metric.
Formula
Output Token Throughput = Benchmark Token Count / Benchmark Duration (seconds)
Source code in aiperf/metrics/types/output_token_throughput_metric.py
11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 | |
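A worked example of this formula, assuming timestamps are tracked in nanoseconds as the `NANOS_PER_SECOND` constant in the derived-metric example suggests (the counts below are illustrative):

```python
NANOS_PER_SECOND = 1_000_000_000

benchmark_token_count = 50_000                 # total output tokens
benchmark_duration_ns = 25 * NANOS_PER_SECOND  # 25-second benchmark

# Convert the duration to seconds before dividing.
throughput = benchmark_token_count / (benchmark_duration_ns / NANOS_PER_SECOND)
# 50_000 tokens / 25 s = 2000.0 tokens/second
```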
aiperf.metrics.types.output_token_throughput_per_user_metric¶
OutputTokenThroughputPerUserMetric
¶
Bases: BaseRecordMetric[float]
Post Processor for calculating Output Token Throughput Per User Metric.
Formula
Output Token Throughput Per User = 1 / Inter-Token Latency (seconds)
Source code in aiperf/metrics/types/output_token_throughput_per_user_metric.py
11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 | |
aiperf.metrics.types.request_count_metric¶
RequestCountMetric
¶
Bases: BaseAggregateMetric[int]
Post-processor for counting the number of valid requests.
Formula
Request Count = Sum(Valid Requests)
Source code in aiperf/metrics/types/request_count_metric.py
10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | |
aiperf.metrics.types.request_latency_metric¶
RequestLatencyMetric
¶
Bases: BaseRecordMetric[int]
Post-processor for calculating Request Latency metrics from records.
Formula
Request Latency = Final Response Timestamp - Request Start Timestamp
Source code in aiperf/metrics/types/request_latency_metric.py
10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | |
aiperf.metrics.types.request_throughput_metric¶
RequestThroughputMetric
¶
Bases: BaseDerivedMetric[float]
Post Processor for calculating Request throughput metrics from records.
Formula
Request Throughput = Valid Request Count / Benchmark Duration (seconds)
Source code in aiperf/metrics/types/request_throughput_metric.py
11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 | |
aiperf.metrics.types.ttft_metric¶
TTFTMetric
¶
Bases: BaseRecordMetric[int]
Post-processor for calculating Time to First Token (TTFT) metrics from records.
Formula
TTFT = First Response Timestamp - Request Start Timestamp
Source code in aiperf/metrics/types/ttft_metric.py
9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 | |
aiperf.metrics.types.ttst_metric¶
TTSTMetric
¶
Bases: BaseRecordMetric[int]
Post-processor for calculating Time to Second Token (TTST) metrics from records.
Formula
TTST = Second Response Timestamp - First Response Timestamp
Source code in aiperf/metrics/types/ttst_metric.py
9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 | |
aiperf.module_loader¶
Module loader for AIPerf.
This module is used to load all modules into the system to ensure everything is registered and ready to be used. This is done to avoid the performance penalty of importing all modules during CLI startup, while still ensuring that all implementations are properly registered with their factories.
ensure_modules_loaded()
¶
Ensure all modules are loaded exactly once.
Source code in aiperf/module_loader.py
47 48 49 50 51 52 53 54 55 56 57 58 | |
aiperf.parsers.inference_result_parser¶
InferenceResultParser
¶
Bases: CommunicationMixin
InferenceResultParser is responsible for parsing the inference results.
Source code in aiperf/parsers/inference_result_parser.py
27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 | |
compute_input_token_count(request_record, tokenizer)
async
¶
Compute the number of tokens in the input for a given request record.
Source code in aiperf/parsers/inference_result_parser.py
180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 | |
configure()
async
¶
Configure the tokenizers.
Source code in aiperf/parsers/inference_result_parser.py
67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 | |
get_tokenizer(model)
async
¶
Get the tokenizer for a given model.
Source code in aiperf/parsers/inference_result_parser.py
90 91 92 93 94 95 96 97 98 99 | |
parse_request_record(request_record)
async
¶
Handle an inference results message.
Source code in aiperf/parsers/inference_result_parser.py
101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 | |
process_valid_record(request_record)
async
¶
Process a valid request record.
Source code in aiperf/parsers/inference_result_parser.py
144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 | |
aiperf.parsers.openai_parsers¶
OpenAIObject
¶
Bases: CaseInsensitiveStrEnum
Types of OpenAI objects.
Source code in aiperf/parsers/openai_parsers.py
28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 | |
parse(text)
classmethod
¶
Attempt to parse a string into an OpenAI object.
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the object is invalid. |
Source code in aiperf/parsers/openai_parsers.py
39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 | |
parse_list(obj)
classmethod
¶
Attempt to parse a string into an OpenAI object from a list.
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the object is invalid. |
Source code in aiperf/parsers/openai_parsers.py
78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 | |
OpenAIResponseExtractor
¶
Extractor for OpenAI responses.
Source code in aiperf/parsers/openai_parsers.py
95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 | |
__init__(model_endpoint)
¶
Create a new response extractor based on the provided configuration.
Source code in aiperf/parsers/openai_parsers.py
104 105 106 | |
extract_response_data(record, tokenizer)
async
¶
Extract the text from a server response message.
Source code in aiperf/parsers/openai_parsers.py
143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 | |
aiperf.post_processors.base_metrics_processor¶
BaseMetricsProcessor
¶
Bases: AIPerfLoggerMixin, ABC
Base class for all metrics processors. This class is responsible for filtering the metrics based on the user config.
Source code in aiperf/post_processors/base_metrics_processor.py
13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 | |
get_filters()
¶
Get the filters for the metrics based on the user config.

Returns:

| Type | Description |
|---|---|
| `tuple[MetricFlags, MetricFlags]` | The required and disallowed flags. |
Source code in aiperf/post_processors/base_metrics_processor.py
20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | |
aiperf.post_processors.metric_record_processor¶
MetricRecordProcessor
¶
Bases: BaseMetricsProcessor
Processor for metric records.
This is the first stage of the metrics processing pipeline, and is done in a distributed manner across multiple service instances. It is responsible for streaming the records to the post processor, and computing the metrics from the records. It computes metrics of the MetricType.RECORD and MetricType.AGGREGATE types.
Source code in aiperf/post_processors/metric_record_processor.py
18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 | |
process_record(record)
async
¶
Process a response record from the inference results parser.
Source code in aiperf/post_processors/metric_record_processor.py
56 57 58 59 60 61 62 63 64 65 66 | |
aiperf.post_processors.metric_results_processor¶
MetricResultsProcessor
¶
Bases: BaseMetricsProcessor
Processor for metric results.
This is the final stage of the metrics processing pipeline, and is done in a unified manner by the RecordsManager. It is responsible for processing the results and returning them to the RecordsManager, as well as summarizing the results.
Source code in aiperf/post_processors/metric_results_processor.py
24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 | |
process_result(incoming_metrics)
async
¶
Process a result from the metric record processor.
Source code in aiperf/post_processors/metric_results_processor.py
summarize()
async
¶
Summarize the results.
This will compute the values for the derived metrics, and then create the MetricResult objects for each metric.
Source code in aiperf/post_processors/metric_results_processor.py
aiperf.records.record_processor_service¶
RecordProcessor
¶
Bases: PullClientMixin, BaseComponentService
RecordProcessor is responsible for processing the records and pushing them to the RecordsManager. This service is meant to be run in a distributed fashion, where the number of record processors can be scaled based on the load of the system.
Source code in aiperf/records/record_processor_service.py
get_tokenizer(model)
async
¶
Get the tokenizer for a given model.
Source code in aiperf/records/record_processor_service.py
aiperf.records.records_manager¶
RecordsManager
¶
Bases: PullClientMixin, BaseComponentService
The RecordsManager service is primarily responsible for holding the results returned from the workers.
Source code in aiperf/records/records_manager.py
get_error_summary()
async
¶
Generate a summary of the error records.
Source code in aiperf/records/records_manager.py
main()
¶
Main entry point for the records manager.
Source code in aiperf/records/records_manager.py
aiperf.timing.concurrency_strategy¶
ConcurrencyStrategy
¶
Bases: CreditIssuingStrategy
Class for concurrency credit issuing strategy.
Source code in aiperf/timing/concurrency_strategy.py
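Concurrency-based credit issuing keeps a fixed number of requests in flight and issues a new credit as soon as one completes. A minimal sketch of that behavior using asyncio.Semaphore (illustrative only; the actual strategy integrates with the credit manager and message bus):

```python
import asyncio

async def run_with_concurrency(requests, concurrency: int):
    """Sketch of a concurrency-based issuing loop: at most `concurrency`
    requests are in flight; a new one starts as soon as a slot frees up."""
    sem = asyncio.Semaphore(concurrency)

    async def issue(delay: float) -> float:
        async with sem:
            await asyncio.sleep(delay)  # stands in for one inference request
            return delay

    return await asyncio.gather(*(issue(d) for d in requests))

results = asyncio.run(run_with_concurrency([0.01, 0.01, 0.01, 0.01], concurrency=2))
print(len(results))  # 4
```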
aiperf.timing.config¶
TimingManagerConfig
¶
Bases: AIPerfBaseModel
Configuration for the timing manager.
Source code in aiperf/timing/config.py
from_user_config(user_config)
classmethod
¶
Create a TimingManagerConfig from a UserConfig.
Source code in aiperf/timing/config.py
aiperf.timing.credit_issuing_strategy¶
CreditIssuingStrategy
¶
Bases: TaskManagerMixin, ABC
Base class for credit issuing strategies.
Source code in aiperf/timing/credit_issuing_strategy.py
start()
async
¶
Start the credit issuing strategy. This will launch the progress reporting loop, the warmup phase (if applicable), and the profiling phase, all in the background.
Source code in aiperf/timing/credit_issuing_strategy.py
stop()
async
¶
Stop the credit issuing strategy.
Source code in aiperf/timing/credit_issuing_strategy.py
CreditIssuingStrategyFactory
¶
Bases: AIPerfFactory[TimingMode, CreditIssuingStrategy]
Factory for creating credit issuing strategies based on the timing mode.
Source code in aiperf/timing/credit_issuing_strategy.py
aiperf.timing.credit_manager¶
CreditManagerProtocol
¶
Bases: PubClientProtocol, Protocol
Defines the interface for a CreditManager.
This is used to allow the credit issuing strategy to interact with the TimingManager in a decoupled way.
Source code in aiperf/timing/credit_manager.py
CreditPhaseMessagesMixin
¶
Bases: MessageBusClientMixin, CreditPhaseMessagesRequirements
Mixin for services to implement the CreditManagerProtocol.
Requirements

This mixin must be used with a class that provides:
- pub_client: PubClientProtocol
- service_id: str
Source code in aiperf/timing/credit_manager.py
publish_credits_complete()
async
¶
Publish the credits complete message.
Source code in aiperf/timing/credit_manager.py
publish_phase_complete(phase, completed, end_ns)
async
¶
Publish the phase complete message.
Source code in aiperf/timing/credit_manager.py
publish_phase_sending_complete(phase, sent_end_ns, sent)
async
¶
Publish the phase sending complete message.
Source code in aiperf/timing/credit_manager.py
publish_phase_start(phase, start_ns, total_expected_requests, expected_duration_sec)
async
¶
Publish the phase start message.
Source code in aiperf/timing/credit_manager.py
publish_progress(phase, sent, completed)
async
¶
Publish the progress message.
Source code in aiperf/timing/credit_manager.py
CreditPhaseMessagesRequirements
¶
Bases: AIPerfLoggerProtocol, Protocol
Requirements for the CreditPhaseMessagesMixin. This is the list of attributes that must be provided by the class that uses this mixin.
Source code in aiperf/timing/credit_manager.py
aiperf.timing.fixed_schedule_strategy¶
FixedScheduleStrategy
¶
Bases: CreditIssuingStrategy
Class for fixed schedule credit issuing strategy.
Source code in aiperf/timing/fixed_schedule_strategy.py
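A fixed schedule replays requests at predetermined offsets from the start of the run. A small sketch of the core idea, turning absolute millisecond offsets into the successive delays to wait between credit drops (the helper name is hypothetical, for illustration only):

```python
def schedule_to_delays(timestamps_ms):
    """Convert an absolute fixed schedule (ms offsets from start) into
    the successive delays to sleep between credit drops."""
    delays = []
    prev = 0
    for ts in sorted(timestamps_ms):
        delays.append(ts - prev)
        prev = ts
    return delays

print(schedule_to_delays([0, 50, 125, 300]))  # [0, 50, 75, 175]
```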
aiperf.timing.request_rate_strategy¶
RequestRateStrategy
¶
Bases: CreditIssuingStrategy
Strategy for issuing credits based on a specified request rate.
Supports two modes:
- CONSTANT: Issues credits at a constant rate with fixed intervals
- POISSON: Issues credits using a Poisson process with exponentially distributed intervals
Source code in aiperf/timing/request_rate_strategy.py
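The two modes above differ only in how the gap between consecutive credits is computed. A self-contained sketch of both (the function name and signature are illustrative, not the aiperf API): CONSTANT spaces credits 1/rate apart, while POISSON draws exponentially distributed gaps with mean 1/rate.

```python
import random

def intervals(rate_per_sec: float, n: int, mode: str = "constant", seed: int = 0):
    """Inter-arrival times (seconds) for n credits at the given rate."""
    if mode == "constant":
        # Fixed spacing: one credit every 1/rate seconds.
        return [1.0 / rate_per_sec] * n
    # Poisson process: exponentially distributed gaps with mean 1/rate.
    rng = random.Random(seed)
    return [rng.expovariate(rate_per_sec) for _ in range(n)]

const = intervals(10.0, 5)             # five gaps of 0.1 s each
poisson = intervals(10.0, 5, "poisson")
print(const)  # [0.1, 0.1, 0.1, 0.1, 0.1]
```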
aiperf.timing.timing_manager¶
TimingManager
¶
Bases: PullClientMixin, BaseComponentService, CreditPhaseMessagesMixin
The TimingManager service is responsible for generating the schedule and issuing timing credits for requests.
Source code in aiperf/timing/timing_manager.py
drop_credit(credit_phase, conversation_id=None, credit_drop_ns=None)
async
¶
Drop a credit.
Source code in aiperf/timing/timing_manager.py
main()
¶
Main entry point for the timing manager.
Source code in aiperf/timing/timing_manager.py
aiperf.ui.base_ui¶
BaseAIPerfUI
¶
Bases: ProgressTrackerMixin, WorkerTrackerMixin
Base class for AIPerf UI implementations.
This class provides a simple starting point for a UI for AIPerf components.
It inherits from the :class:ProgressTrackerMixin and :class:WorkerTrackerMixin,
and the various hooks defined in those mixins can be used to create a UI for AIPerf components.
Example:

```python
@AIPerfUIFactory.register("custom")
class MyUI(BaseAIPerfUI):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    @on_records_progress
    def _on_records_progress(self, records_stats: RecordsStats):
        """Callback for records progress updates."""
        pass

    @on_requests_phase_progress
    def _on_requests_phase_progress(self, phase: CreditPhase, requests_stats: RequestsStats):
        """Callback for requests phase progress updates."""
        pass

    @on_worker_update
    def _on_worker_update(self, worker_id: str, worker_stats: WorkerStats):
        """Callback for worker updates."""
        pass
```
Source code in aiperf/ui/base_ui.py
aiperf.ui.no_ui¶
NoUI
¶
Bases: AIPerfLifecycleMixin
A UI that does nothing.
Implements the :class:AIPerfUIProtocol to allow it to be used as a UI, but provides no functionality.
NOTE: Not inheriting from :class:BaseAIPerfUI because it does not need to track progress or workers.
Source code in aiperf/ui/no_ui.py
aiperf.ui.tqdm_ui¶
ProgressBar
¶
A progress bar that can be updated with a progress percentage.
Source code in aiperf/ui/tqdm_ui.py
close()
¶
Close the progress bar.
Source code in aiperf/ui/tqdm_ui.py
update(progress)
¶
Update the progress bar with a new progress percentage.
Source code in aiperf/ui/tqdm_ui.py
TQDMProgressUI
¶
Bases: BaseAIPerfUI
A UI that shows progress bars for the records and requests phases.
Source code in aiperf/ui/tqdm_ui.py
aiperf.workers.credit_processor_mixin¶
CreditProcessorMixin
¶
Bases: CreditProcessorMixinRequirements
CreditProcessorMixin is a mixin that provides a method to process credit drops.
Source code in aiperf/workers/credit_processor_mixin.py
CreditProcessorMixinRequirements
¶
Bases: AIPerfLoggerProtocol, Protocol
CreditProcessorMixinRequirements is a protocol that provides the requirements needed for the CreditProcessorMixin.
Source code in aiperf/workers/credit_processor_mixin.py
CreditProcessorProtocol
¶
Bases: Protocol
CreditProcessorProtocol is a protocol that provides a method to process credit drops.
Source code in aiperf/workers/credit_processor_mixin.py
aiperf.workers.worker¶
Worker
¶
Bases: PullClientMixin, BaseComponentService, ProcessHealthMixin, CreditProcessorMixin
Worker is primarily responsible for making API calls to the inference server. It also manages the conversation between turns and returns the results to the Inference Results Parsers.
Source code in aiperf/workers/worker.py
aiperf.workers.worker_manager¶
WorkerManager
¶
Bases: BaseComponentService
The WorkerManager service's primary responsibility is to manage the worker processes. It will spawn the workers, monitor their health, and stop them when the service is stopped. In the future it will also be responsible for the auto-scaling of the workers.
Source code in aiperf/workers/worker_manager.py
WorkerStatusInfo
¶
Bases: WorkerStats
Information about a worker's status.
Source code in aiperf/workers/worker_manager.py
aiperf.zmq.dealer_request_client¶
ZMQDealerRequestClient
¶
Bases: BaseZMQClient, TaskManagerMixin
ZMQ DEALER socket client for asynchronous request-response communication.
The DEALER socket connects to ROUTER sockets and can send requests asynchronously, receiving responses through callbacks or awaitable futures.
ASCII Diagram:

```
┌──────────────┐                    ┌──────────────┐
│    DEALER    │───── Request ─────>│    ROUTER    │
│   (Client)   │                    │  (Service)   │
│              │<─── Response ──────│              │
└──────────────┘                    └──────────────┘
```

Usage Pattern:
- DEALER Clients send requests to ROUTER Services
- Responses are routed back to the originating DEALER

DEALER/ROUTER is a Many-to-One communication pattern. If you need Many-to-Many, use a ZMQ Proxy as well. See :class:ZMQDealerRouterProxy for more details.
Source code in aiperf/zmq/dealer_request_client.py
__init__(address, bind, socket_ops=None, **kwargs)
¶
Initialize the ZMQ Dealer (Req) client class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| address | str | The address to bind or connect to. | required |
| bind | bool | Whether to bind or connect the socket. | required |
| socket_ops | dict | Additional socket options to set. | None |
Source code in aiperf/zmq/dealer_request_client.py
request(message, timeout=DEFAULT_COMMS_REQUEST_TIMEOUT)
async
¶
Send a request and wait for a response up to timeout seconds.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| message | Message | The request message to send. | required |
| timeout | float | Maximum time to wait for a response in seconds. | DEFAULT_COMMS_REQUEST_TIMEOUT |

Returns:

| Name | Type | Description |
|---|---|---|
| Message | Message | The response message received. |

Raises:

| Type | Description |
|---|---|
| CommunicationError | If the request fails. |
| TimeoutError | If the response is not received in time. |
Source code in aiperf/zmq/dealer_request_client.py
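The request/response flow above can be sketched with correlation ids and asyncio futures: each outgoing request registers a Future, the matching response resolves it, and the caller waits on it up to the timeout. This is a simplified stand-in for illustration, not the actual ZMQDealerRequestClient implementation (no sockets involved; the "service" just echoes a reply):

```python
import asyncio
import itertools

class RequestResponseSketch:
    """Minimal sketch of the DEALER-style request pattern."""
    _ids = itertools.count()

    def __init__(self):
        self._pending: dict[int, asyncio.Future] = {}

    async def request(self, payload, timeout: float = 1.0):
        req_id = next(self._ids)
        fut = asyncio.get_running_loop().create_future()
        self._pending[req_id] = fut
        # A real client would send (req_id, payload) over the socket here;
        # we simulate the service echoing a reply back on the next loop tick.
        asyncio.get_running_loop().call_soon(self._on_response, req_id, f"ack:{payload}")
        try:
            # Wait for the correlated response, up to `timeout` seconds.
            return await asyncio.wait_for(fut, timeout)
        finally:
            self._pending.pop(req_id, None)

    def _on_response(self, req_id, response):
        fut = self._pending.get(req_id)
        if fut is not None and not fut.done():
            fut.set_result(response)

async def main():
    client = RequestResponseSketch()
    return await client.request("ping")

result = asyncio.run(main())
print(result)  # ack:ping
```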
request_async(message, callback)
async
¶
Send a request and be notified when the response is received.
Source code in aiperf/zmq/dealer_request_client.py
aiperf.zmq.pub_client¶
ZMQPubClient
¶
Bases: BaseZMQClient
The PUB socket broadcasts messages to all connected SUB sockets that have subscribed to the message topic/type.
ASCII Diagram:

```
┌──────────────┐    ┌──────────────┐
│     PUB      │───>│              │
│ (Publisher)  │    │              │
└──────────────┘    │     SUB      │
┌──────────────┐    │ (Subscriber) │
│     PUB      │───>│              │
│ (Publisher)  │    │              │
└──────────────┘    └──────────────┘

                 OR

┌──────────────┐    ┌──────────────┐
│              │───>│     SUB      │
│              │    │ (Subscriber) │
│     PUB      │    └──────────────┘
│ (Publisher)  │    ┌──────────────┐
│              │───>│     SUB      │
│              │    │ (Subscriber) │
└──────────────┘    └──────────────┘
```

Usage Pattern:
- Single PUB socket broadcasts messages to all subscribers (One-to-Many) OR
- Multiple PUB sockets broadcast messages to a single SUB socket (Many-to-One)
- SUB sockets filter messages by topic/message_type
- Fire-and-forget messaging (no acknowledgments)

PUB/SUB is a One-to-Many communication pattern. If you need Many-to-Many, use a ZMQ Proxy as well. See :class:ZMQXPubXSubProxy for more details.
Source code in aiperf/zmq/pub_client.py
__init__(address, bind, socket_ops=None, **kwargs)
¶
Initialize the ZMQ Publisher client class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| address | str | The address to bind or connect to. | required |
| bind | bool | Whether to bind or connect the socket. | required |
| socket_ops | dict | Additional socket options to set. | None |
Source code in aiperf/zmq/pub_client.py
publish(message)
async
¶
Publish a message. The topic will be set automatically based on the message type.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| message | Message | Message to publish (must be a Message object) | required |
Source code in aiperf/zmq/pub_client.py
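The topic-from-message-type behavior described above can be sketched with a tiny in-memory bus: the topic is derived from the message's type, and only subscribers of that topic receive it, fire-and-forget. The class below is hypothetical and for illustration only; real clients transport messages over ZMQ sockets:

```python
class TopicBusSketch:
    """In-memory sketch of PUB/SUB topic routing (no sockets)."""
    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, message: dict):
        # The topic is derived from the message type, matching the
        # behavior described for ZMQPubClient.publish.
        topic = message["message_type"]
        for cb in self._subscribers.get(topic, []):
            cb(message)  # fire-and-forget: no acknowledgments

bus = TopicBusSketch()
seen = []
bus.subscribe("status", seen.append)
bus.publish({"message_type": "status", "ok": True})
bus.publish({"message_type": "heartbeat"})  # no subscriber; silently dropped
print(len(seen))  # 1
```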
aiperf.zmq.pull_client¶
ZMQPullClient
¶
Bases: BaseZMQClient
ZMQ PULL socket client for receiving work from PUSH sockets.
The PULL socket receives messages from PUSH sockets in a pipeline pattern, distributing work fairly among multiple PULL workers.
ASCII Diagram:

```
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│    PUSH     │    │    PULL     │    │    PULL     │
│ (Producer)  │    │ (Worker 1)  │    │ (Worker 2)  │
│             │    └─────────────┘    └─────────────┘
│ Tasks:      │           ▲                  ▲
│  - Task A   │───────────┘                  │
│  - Task B   │──────────────────────────────┘
│  - Task C   │───────────┐
│  - Task D   │           ▼
└─────────────┘    ┌─────────────┐
                   │    PULL     │
                   │ (Worker N)  │
                   └─────────────┘
```

Usage Pattern:
- PULL receives work from multiple PUSH producers
- Work is fairly distributed among PULL workers
- Pipeline pattern for distributed processing
- Each message is delivered to exactly one PULL socket

PULL/PUSH is a One-to-Many communication pattern. If you need Many-to-Many, use a ZMQ Proxy as well. See :class:ZMQPushPullProxy for more details.
Source code in aiperf/zmq/pull_client.py
__init__(address, bind, socket_ops=None, max_pull_concurrency=None, **kwargs)
¶
Initialize the ZMQ Puller class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| address | str | The address to bind or connect to. | required |
| bind | bool | Whether to bind or connect the socket. | required |
| socket_ops | dict | Additional socket options to set. | None |
| max_pull_concurrency | int | The maximum number of concurrent requests to allow. | None |
Source code in aiperf/zmq/pull_client.py
register_pull_callback(message_type, callback)
¶
Register a ZMQ Pull data callback for a given message type.
Note that only one callback can be registered for a given message type.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| message_type | MessageTypeT | The message type to register the callback for. | required |
| callback | Callable[[Message], Coroutine[Any, Any, None]] | The function to call when data is received. | required |

Raises:

| Type | Description |
|---|---|
| CommunicationError | If the client is not initialized. |
Source code in aiperf/zmq/pull_client.py
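The one-callback-per-message-type rule noted above can be sketched with a plain dict registry that rejects duplicate registrations. The class below is hypothetical, for illustration only:

```python
class CallbackRegistry:
    """Sketch of a one-callback-per-message-type registry."""
    def __init__(self):
        self._callbacks = {}

    def register(self, message_type: str, callback) -> None:
        # Only one callback may be registered per message type.
        if message_type in self._callbacks:
            raise ValueError(f"callback already registered for {message_type!r}")
        self._callbacks[message_type] = callback

    def dispatch(self, message_type: str, message):
        return self._callbacks[message_type](message)

reg = CallbackRegistry()
reg.register("records", lambda msg: f"processed {msg}")
print(reg.dispatch("records", "r1"))  # processed r1
```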
aiperf.zmq.push_client¶
MAX_PUSH_RETRIES = 2
module-attribute
¶
Maximum number of retries for pushing a message.
RETRY_DELAY_INTERVAL_SEC = 0.1
module-attribute
¶
The interval to wait before retrying to push a message.
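A sketch of how these two module constants could drive a retry loop: one initial attempt plus up to MAX_PUSH_RETRIES retries, waiting RETRY_DELAY_INTERVAL_SEC between attempts. This is illustrative only, not the actual aiperf push implementation:

```python
import time

MAX_PUSH_RETRIES = 2
RETRY_DELAY_INTERVAL_SEC = 0.1

def push_with_retries(send, message):
    """Attempt `send` up to 1 + MAX_PUSH_RETRIES times, sleeping
    RETRY_DELAY_INTERVAL_SEC between attempts."""
    last_error = None
    for attempt in range(1 + MAX_PUSH_RETRIES):
        try:
            return send(message)
        except ConnectionError as e:
            last_error = e
            if attempt < MAX_PUSH_RETRIES:
                time.sleep(RETRY_DELAY_INTERVAL_SEC)
    raise last_error

attempts = []
def flaky_send(msg):
    attempts.append(msg)
    if len(attempts) < 2:
        raise ConnectionError("socket busy")  # first attempt fails
    return "sent"

result = push_with_retries(flaky_send, "m1")
print(result)  # sent
```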
ZMQPushClient
¶
Bases: BaseZMQClient
ZMQ PUSH socket client for sending work to PULL sockets.
The PUSH socket sends messages to PULL sockets in a pipeline pattern, distributing work fairly among available PULL workers.
ASCII Diagram:

```
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│    PUSH     │    │    PULL     │    │    PULL     │
│ (Producer)  │    │ (Worker 1)  │    │ (Worker 2)  │
│             │    └─────────────┘    └─────────────┘
│ Tasks:      │           ▲                  ▲
│  - Task A   │───────────┘                  │
│  - Task B   │──────────────────────────────┘
│  - Task C   │───────────┐
│  - Task D   │           ▼
└─────────────┘    ┌─────────────┐
                   │    PULL     │
                   │ (Worker 3)  │
                   └─────────────┘
```

Usage Pattern:
- Round-robin distribution of work tasks (One-to-Many)
- Each message delivered to exactly one worker
- Pipeline pattern for distributed processing
- Automatic load balancing across available workers

PUSH/PULL is a One-to-Many communication pattern. If you need Many-to-Many, use a ZMQ Proxy as well. See :class:ZMQPushPullProxy for more details.
Source code in aiperf/zmq/push_client.py
__init__(address, bind, socket_ops=None, **kwargs)
¶
Initialize the ZMQ Push client class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| address | str | The address to bind or connect to. | required |
| bind | bool | Whether to bind or connect the socket. | required |
| socket_ops | dict | Additional socket options to set. | None |
Source code in aiperf/zmq/push_client.py
push(message)
async
¶
Push data to a target. The message will be routed automatically based on the message type.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| message | Message | Message to be sent (must be a Message object) | required |
Source code in aiperf/zmq/push_client.py
aiperf.zmq.router_reply_client¶
ZMQRouterReplyClient
¶
Bases: BaseZMQClient
ZMQ ROUTER socket client for handling requests from DEALER clients.
The ROUTER socket receives requests from DEALER clients and sends responses back to the originating DEALER client using routing envelopes.
ASCII Diagram:

```
┌──────────────┐                    ┌──────────────┐
│    DEALER    │───── Request ─────>│              │
│   (Client)   │<──── Response ─────│              │
└──────────────┘                    │              │
┌──────────────┐                    │    ROUTER    │
│    DEALER    │───── Request ─────>│  (Service)   │
│   (Client)   │<──── Response ─────│              │
└──────────────┘                    │              │
┌──────────────┐                    │              │
│    DEALER    │───── Request ─────>│              │
│   (Client)   │<──── Response ─────│              │
└──────────────┘                    └──────────────┘
```

Usage Pattern:
- ROUTER handles requests from multiple DEALER clients
- Maintains routing envelopes to send responses back
- Many-to-one request handling pattern
- Supports concurrent request processing

ROUTER/DEALER is a Many-to-One communication pattern. If you need Many-to-Many, use a ZMQ Proxy as well. See :class:ZMQDealerRouterProxy for more details.
Source code in aiperf/zmq/router_reply_client.py
__init__(address, bind, socket_ops=None, **kwargs)
¶
Initialize the ZMQ Router (Rep) client class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| address | str | The address to bind or connect to. | required |
| bind | bool | Whether to bind or connect the socket. | required |
| socket_ops | dict | Additional socket options to set. | None |
Source code in aiperf/zmq/router_reply_client.py
register_request_handler(service_id, message_type, handler)
¶
Register a request handler. Anytime a request is received that matches the message type, the handler will be called. The handler should return a response message. If the handler returns None, the request will be ignored.
Note that there is a strict one-to-one mapping between message type and handler.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| service_id | str | The service ID to register the handler for | required |
| message_type | MessageTypeT | The message type to register the handler for | required |
| handler | Callable[[Message], Coroutine[Any, Any, Message \| None]] | The handler to register | required |
Source code in aiperf/zmq/router_reply_client.py
aiperf.zmq.sub_client¶
ZMQSubClient
¶
Bases: BaseZMQClient
ZMQ SUB socket client for subscribing to messages from PUB sockets. One-to-Many or Many-to-One communication pattern.
ASCII Diagram:

```
┌──────────────┐    ┌──────────────┐
│     PUB      │───>│              │
│ (Publisher)  │    │              │
└──────────────┘    │     SUB      │
┌──────────────┐    │ (Subscriber) │
│     PUB      │───>│              │
│ (Publisher)  │    │              │
└──────────────┘    └──────────────┘

                 OR

┌──────────────┐    ┌──────────────┐
│              │───>│     SUB      │
│              │    │ (Subscriber) │
│     PUB      │    └──────────────┘
│ (Publisher)  │    ┌──────────────┐
│              │───>│     SUB      │
│              │    │ (Subscriber) │
└──────────────┘    └──────────────┘
```

Usage Pattern:
- Single SUB socket subscribes to multiple PUB publishers (One-to-Many) OR
- Multiple SUB sockets subscribe to a single PUB publisher (Many-to-One)
- Subscribes to specific message topics/types
- Receives all messages matching subscriptions

SUB/PUB is a One-to-Many communication pattern. If you need Many-to-Many, use a ZMQ Proxy as well. See :class:ZMQXPubXSubProxy for more details.
Source code in aiperf/zmq/sub_client.py
__init__(address, bind, socket_ops=None, **kwargs)
¶
Initialize the ZMQ Subscriber class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| address | str | The address to bind or connect to. | required |
| bind | bool | Whether to bind or connect the socket. | required |
| socket_ops | dict | Additional socket options to set. | None |
Source code in aiperf/zmq/sub_client.py
subscribe(message_type, callback)
async
¶
Subscribe to a message_type.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| message_type | MessageTypeT | MessageTypeT to subscribe to | required |
| callback | Callable[[Message], Any] | Function to call when a message is received (receives Message object) | required |
Source code in aiperf/zmq/sub_client.py
subscribe_all(message_callback_map)
async
¶
Subscribe to all message_types in the map. For each MessageType, a single callback or a list of callbacks can be provided.
Source code in aiperf/zmq/sub_client.py
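As described above, subscribe_all accepts either a single callback or a list of callbacks per message type. Normalizing both shapes to lists, so every subscription is handled uniformly, can be sketched as follows (the helper name is hypothetical, for illustration only):

```python
def normalize_callback_map(message_callback_map):
    """Normalize a {message_type: callback-or-list} map so every value
    is a list of callbacks."""
    normalized = {}
    for message_type, cbs in message_callback_map.items():
        normalized[message_type] = list(cbs) if isinstance(cbs, (list, tuple)) else [cbs]
    return normalized

on_status = lambda m: m
result = normalize_callback_map({"status": on_status, "heartbeat": [on_status, on_status]})
print({k: len(v) for k, v in result.items()})  # {'status': 1, 'heartbeat': 2}
```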
aiperf.zmq.zmq_base_client¶
BaseZMQClient
¶
Bases: AIPerfLifecycleMixin
Base class for all ZMQ clients. It can be used as-is to create a new ZMQ client, or it can be subclassed to create specific ZMQ client functionality.
It inherits from the :class:AIPerfLifecycleMixin, allowing derived
classes to implement specific hooks.
Source code in aiperf/zmq/zmq_base_client.py
socket_type_name
property
¶
Get the name of the socket type.
__init__(socket_type, address, bind, socket_ops=None, client_id=None, **kwargs)
¶
Initialize the ZMQ Base class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `socket_type` | `SocketType` | The type of ZMQ socket (e.g. PUB, SUB, ROUTER, DEALER). | *required* |
| `address` | `str` | The address to bind or connect to. | *required* |
| `bind` | `bool` | Whether to BIND or CONNECT the socket. | *required* |
| `socket_ops` | `dict` | Additional socket options to set. | `None` |
Source code in aiperf/zmq/zmq_base_client.py
aiperf.zmq.zmq_comms¶
BaseZMQCommunication
¶
Bases: BaseCommunication, AIPerfLoggerMixin, ABC
ZeroMQ-based implementation of the CommunicationProtocol.
Uses ZeroMQ for publish/subscribe, request/reply, and pull/push patterns to facilitate communication between AIPerf components.
Source code in aiperf/zmq/zmq_comms.py
create_client(client_type, address, bind=False, socket_ops=None, max_pull_concurrency=None, **kwargs)
¶
Create a communication client for a given client type and address.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `client_type` | `CommClientType` | The type of client to create. | *required* |
| `address` | `CommAddressType` | The type of address to look up in the communication config, or the address itself. | *required* |
| `bind` | `bool` | Whether to bind or connect the socket. | `False` |
| `socket_ops` | `dict \| None` | Additional socket options to set. | `None` |
| `max_pull_concurrency` | `int \| None` | The maximum number of concurrent pull requests to allow (only used for pull clients). | `None` |
Source code in aiperf/zmq/zmq_comms.py
get_address(address_type)
¶
Get the actual address based on the address type from the config.
Source code in aiperf/zmq/zmq_comms.py
ZMQIPCCommunication
¶
Bases: BaseZMQCommunication
ZeroMQ-based implementation of the Communication interface using IPC transport.
Source code in aiperf/zmq/zmq_comms.py
__init__(config=None)
¶
Initialize ZMQ IPC communication.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `ZMQIPCConfig \| None` | `ZMQIPCConfig` object with configuration parameters. | `None` |
Source code in aiperf/zmq/zmq_comms.py
ZMQTCPCommunication
¶
Bases: BaseZMQCommunication
ZeroMQ-based implementation of the Communication interface using TCP transport.
Source code in aiperf/zmq/zmq_comms.py
__init__(config=None)
¶
Initialize ZMQ TCP communication.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `ZMQTCPConfig \| None` | `ZMQTCPConfig` object with configuration parameters. | `None` |
Source code in aiperf/zmq/zmq_comms.py
aiperf.zmq.zmq_defaults¶
TOPIC_DELIMITER = '.'
module-attribute
¶
The delimiter between topic parts. This is used to create an inverted hierarchy of topics for filtering by service type or service id.
For example:
- `command`
- `system_controller.command`
- `timing_manager_eff34565.command`
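One plausible reading of this inverted hierarchy is that a subscriber filtering on a bare suffix like `command` should also match more-specific forms such as `system_controller.command`. The sketch below illustrates that idea in pure Python; `make_topic` and `matches` are hypothetical helpers, not aiperf functions, and the matching semantics shown here is an assumption inferred from the docstring.

```python
TOPIC_DELIMITER = "."


def make_topic(*parts):
    """Join topic parts from most-specific to least-specific (illustrative helper)."""
    return TOPIC_DELIMITER.join(parts)


def matches(topic, suffix):
    """A filter on a suffix matches the bare topic and any more-specific form of it."""
    return topic == suffix or topic.endswith(TOPIC_DELIMITER + suffix)


t = make_topic("timing_manager_eff34565", "command")
# "timing_manager_eff34565.command" is a more specific form of "command",
# while "command_response" is a different topic entirely.
```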
TOPIC_END = '$'
module-attribute
¶
This is used to add to the end of each topic to prevent the topic from being a prefix of another topic. This is required for the PUB/SUB pattern to work correctly, otherwise topics like "command_response" will be received by the "command" subscriber as well.
For example:
- `command$`
- `command_response$`
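The underlying issue is that ZMQ SUB sockets match subscriptions as raw byte prefixes. The self-contained sketch below mimics that prefix matching in pure Python to show why the terminator is needed; `zmq_prefix_match` is an illustrative helper, not a pyzmq API.

```python
TOPIC_END = "$"


def zmq_prefix_match(published_topic, subscription):
    """Mimic ZMQ SUB subscription matching, which compares raw byte prefixes."""
    return published_topic.startswith(subscription)


# Without the terminator, a "command" subscriber also receives "command_response":
assert zmq_prefix_match("command_response", "command")

# With the terminator appended to both the published topic and the subscription,
# one topic can no longer be a proper prefix of another:
assert not zmq_prefix_match("command_response" + TOPIC_END, "command" + TOPIC_END)
assert zmq_prefix_match("command" + TOPIC_END, "command" + TOPIC_END)
```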
TOPIC_END_ENCODED = TOPIC_END.encode()
module-attribute
¶
The encoded version of TOPIC_END.
ZMQSocketDefaults
¶
Default values for ZMQ sockets.
Source code in aiperf/zmq/zmq_defaults.py
aiperf.zmq.zmq_proxy_base¶
BaseZMQProxy
¶
Bases: AIPerfLifecycleMixin, ABC
A Base ZMQ Proxy class.
- Frontend and backend sockets forward messages bidirectionally.
- Frontend and backend sockets both BIND.
- Multiple clients CONNECT to `frontend_address`.
- Multiple services CONNECT to `backend_address`.
- Control: optional REP socket for proxy commands (start/stop/pause); not implemented yet.
- Monitoring: optional PUB socket that broadcasts copies of all forwarded messages; not implemented yet.
- The proxy runs in a separate thread to avoid blocking the main event loop.
Source code in aiperf/zmq/zmq_proxy_base.py
__init__(frontend_socket_class, backend_socket_class, zmq_proxy_config, socket_ops=None, proxy_uuid=None)
¶
Initialize the ZMQ Proxy. This is a base class for all ZMQ Proxies.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `frontend_socket_class` | `type[BaseZMQClient]` | The frontend socket class. | *required* |
| `backend_socket_class` | `type[BaseZMQClient]` | The backend socket class. | *required* |
| `zmq_proxy_config` | `BaseZMQProxyConfig` | The ZMQ proxy configuration. | *required* |
| `socket_ops` | `dict` | Additional socket options to set. | `None` |
| `proxy_uuid` | `str` | An optional UUID for the proxy instance; a new UUID is generated if not provided. Useful for tracing and debugging. | `None` |
Source code in aiperf/zmq/zmq_proxy_base.py
from_config(config, socket_ops=None)
abstractmethod
classmethod
¶
Create a BaseZMQProxy from a BaseZMQProxyConfig, or None if not provided.
Source code in aiperf/zmq/zmq_proxy_base.py
ProxySocketClient
¶
Bases: BaseZMQClient
A ZMQ Proxy socket client class that extends BaseZMQClient.
This class is used to create proxy sockets for the frontend, backend, capture, and control endpoint types of a ZMQ Proxy.
Source code in aiperf/zmq/zmq_proxy_base.py
aiperf.zmq.zmq_proxy_sockets¶
ZMQDealerRouterProxy = define_proxy_class(ZMQProxyType.DEALER_ROUTER, create_proxy_socket_class(SocketType.ROUTER, ProxyEndType.Frontend), create_proxy_socket_class(SocketType.DEALER, ProxyEndType.Backend))
module-attribute
¶
A ROUTER socket for the proxy's frontend and a DEALER socket for the proxy's backend.
ASCII diagram:

```
┌───────────┐      ┌──────────────────────────────────┐      ┌───────────┐
│  DEALER   │<────>│              PROXY               │<────>│  ROUTER   │
│ Client 1  │      │  ┌──────────┐    ┌──────────┐    │      │ Service 1 │
└───────────┘      │  │  ROUTER  │<──>│  DEALER  │    │      └───────────┘
┌───────────┐      │  │ Frontend │    │ Backend  │    │      ┌───────────┐
│  DEALER   │<────>│  └──────────┘    └──────────┘    │<────>│  ROUTER   │
│ Client N  │      └──────────────────────────────────┘      │ Service N │
└───────────┘                                                └───────────┘
```
The ROUTER frontend socket receives messages from DEALER clients and forwards them through the proxy to ROUTER services. The ZMQ proxy handles the message routing automatically.
The DEALER backend socket receives messages from ROUTER services and forwards them through the proxy to DEALER clients. The ZMQ proxy handles the message routing automatically.
CRITICAL: This socket must NOT have an identity when used in a proxy configuration, as it needs to be transparent to preserve routing envelopes for proper response forwarding back to original DEALER clients.
ZMQPushPullProxy = define_proxy_class(ZMQProxyType.PUSH_PULL, create_proxy_socket_class(SocketType.PULL, ProxyEndType.Frontend), create_proxy_socket_class(SocketType.PUSH, ProxyEndType.Backend))
module-attribute
¶
A PULL socket for the proxy's frontend and a PUSH socket for the proxy's backend.
ASCII diagram:

```
┌───────────┐      ┌─────────────────────────────────┐      ┌───────────┐
│   PUSH    │─────>│              PROXY              │─────>│   PULL    │
│ Client 1  │      │  ┌──────────┐    ┌──────────┐   │      │ Service 1 │
└───────────┘      │  │   PULL   │───>│   PUSH   │   │      └───────────┘
┌───────────┐      │  │ Frontend │    │ Backend  │   │      ┌───────────┐
│   PUSH    │─────>│  └──────────┘    └──────────┘   │─────>│   PULL    │
│ Client N  │      └─────────────────────────────────┘      │ Service N │
└───────────┘                                               └───────────┘
```
The PULL frontend socket receives messages from PUSH clients and forwards them through the proxy to PULL services. The ZMQ proxy handles the message routing automatically.
The PUSH backend socket forwards messages from the proxy to PULL services. The ZMQ proxy handles the message routing automatically.
ZMQXPubXSubProxy = define_proxy_class(ZMQProxyType.XPUB_XSUB, create_proxy_socket_class(SocketType.XSUB, ProxyEndType.Frontend), create_proxy_socket_class(SocketType.XPUB, ProxyEndType.Backend))
module-attribute
¶
An XSUB socket for the proxy's frontend and an XPUB socket for the proxy's backend.
ASCII diagram:

```
┌───────────┐      ┌─────────────────────────────────┐      ┌───────────┐
│    PUB    │─────>│              PROXY              │─────>│    SUB    │
│ Client 1  │      │  ┌──────────┐    ┌──────────┐   │      │ Service 1 │
└───────────┘      │  │   XSUB   │───>│   XPUB   │   │      └───────────┘
┌───────────┐      │  │ Frontend │    │ Backend  │   │      ┌───────────┐
│    PUB    │─────>│  └──────────┘    └──────────┘   │─────>│    SUB    │
│ Client N  │      └─────────────────────────────────┘      │ Service N │
└───────────┘                                               └───────────┘
```
The XSUB frontend socket receives messages from PUB clients and forwards them through the proxy to SUB services. The ZMQ proxy handles the message routing automatically.
The XPUB backend socket forwards messages from the proxy to SUB services. The ZMQ proxy handles the message routing automatically.
create_proxy_socket_class(socket_type, end_type)
¶
Create a proxy socket class using the specified socket type. This is used to reduce the boilerplate code required to create a ZMQ Proxy class.
Source code in aiperf/zmq/zmq_proxy_sockets.py
define_proxy_class(proxy_type, frontend_socket_class, backend_socket_class)
¶
This function reduces the boilerplate code required to create a ZMQ Proxy class. It will generate a ZMQ Proxy class and register it with the ZMQProxyFactory.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `proxy_type` | `ZMQProxyType` | The type of proxy to generate. | *required* |
| `frontend_socket_class` | `type[BaseZMQClient]` | The class of the frontend socket. | *required* |
| `backend_socket_class` | `type[BaseZMQClient]` | The class of the backend socket. | *required* |
Source code in aiperf/zmq/zmq_proxy_sockets.py
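The define-and-register pattern behind `define_proxy_class` can be sketched with pure Python: build a class dynamically with three-argument `type()` and record it in a factory registry keyed by proxy type. All names below (`BaseProxy`, `define_proxy`, `PROXY_REGISTRY`) are hypothetical stand-ins for aiperf's actual classes.

```python
# Illustrative factory registry, keyed by proxy type string.
PROXY_REGISTRY = {}


class BaseProxy:
    """Minimal stand-in for BaseZMQProxy."""

    def __init__(self, frontend_cls, backend_cls):
        self.frontend_cls = frontend_cls
        self.backend_cls = backend_cls


def define_proxy(proxy_type, frontend_cls, backend_cls):
    """Generate a proxy class bound to the given socket classes and register it."""

    def __init__(self):
        BaseProxy.__init__(self, frontend_cls, backend_cls)

    # "push_pull" -> "PushPullProxy"
    name = proxy_type.title().replace("_", "") + "Proxy"
    cls = type(name, (BaseProxy,), {"__init__": __init__})
    PROXY_REGISTRY[proxy_type] = cls
    return cls


PushPullProxy = define_proxy("push_pull", frontend_cls="PULL", backend_cls="PUSH")
proxy = PROXY_REGISTRY["push_pull"]()
```

This keeps each concrete proxy definition down to a single call, while the factory lookup stays a plain dictionary access.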